{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Wed May 27 01:01:23 UTC 2020\n"
]
}
],
"source": [
"!date"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction\n",
"\n",
"In this part of the course, we will discuss how to quantify droplet-based single-cell RNA-seq (dscRNA-seq) data using _alevin_. We will cover the command-line flags used by the [_alevin_](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1670-y) tool in its indexing and quantification stages, and quantify a small subset of the data from the experiment by [Hermann et al.](https://pubmed.ncbi.nlm.nih.gov/30404016/)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reference Transcriptome\n",
"\n",
"Alevin uses a transcriptome-alignment strategy to align the dscRNA-seq reads.\n",
"Under the hood, alevin uses [Salmon's](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5600148/) [selective-alignment](https://www.biorxiv.org/content/10.1101/657874v2) infrastructure to generate the alignments, and it starts by _indexing_ the reference transcriptome.\n",
"In this tutorial we will use a small reference transcriptome, generated by subsetting the mouse transcriptome to the transcripts from chromosomes 18 and 19.\n",
"It has already been copied into your environment.  \n",
"**NOTE**: A user can download the full transcriptome from https://www.gencodegenes.org/ .\n",
"\n",
"Let's first start by checking if we can access salmon and the required data through our environment. \n",
"**NOTE**: `!` enables the bash command mode for a line in the ipython notebook \n",
"**NOTE**: `%%bash` enables the bash command mode for the cell in the ipython notebook"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"salmon v1.2.1\n",
"\n",
"Usage: salmon -h|--help or \n",
" salmon -v|--version or \n",
" salmon -c|--cite or \n",
" salmon [--no-version-check] <COMMAND> [-h | options]\n",
"\n",
"Commands:\n",
" index Create a salmon index\n",
" quant Quantify a sample\n",
" alevin single cell analysis\n",
" swim Perform super-secret operation\n",
" quantmerge Merge multiple quantifications into a single file\n"
]
}
],
"source": [
"!salmon --help"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"AdultMouseRep3sub1M_S1_L001_R1_001.fastq.gz\n",
"AdultMouseRep3sub1M_S1_L001_R2_001.fastq.gz\n",
"GRCm38.gencode.vM21.chr18.chr19.genome.fa\n",
"GRCm38.gencode.vM21.chr18.chr19.gtf\n",
"GRCm38.gencode.vM21.chr18.chr19.tgMap.txt\n",
"GRCm38.gencode.vM21.chr18.chr19.txome.fa\n",
">ENSMUST00000234132.1|ENSMUSG00000117547.1|OTTMUSG00000072753.1|OTTMUST00000176063.1|AC125218.3-201|AC125218.3|252|processed_pseudogene|\n",
"CCTTAACCATAGGTACAGGTAATCAACTCAGAATGAAAAGCCAGTAGCTATGAACAAGGCGGAGGTGCCACTGCTAACCC\n",
"TGTGGCCACAGCACCCTTACCGCAGCTCTCAAGTGAGATTGAACGCCTCATGAGTCAGGGTTATTACTACCAGGACATTC\n",
"AGAAATCTCTGGTCATTGCCCAAAACAACATTGAGATTGCTAAAAACATCCTCCAGGAATTTGTTTCTATTTCTTCTCCT\n"
]
}
],
"source": [
"%%bash\n",
"ls data/spermatogenesis_subset\n",
"head -4 data/spermatogenesis_subset/GRCm38.gencode.vM21.chr18.chr19.txome.fa"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Salmon Indexing\n",
"\n",
"Indexing is the process by which salmon preprocesses the reference sequences and stores them in an efficient data structure designed to optimize alignment speed and accuracy. Salmon follows a k-mer-based indexing approach (more discussion to follow), which is enabled by the `salmon index` command. Understanding the command-line flags of a tool is important for tuning its efficiency and customizing it to your use case. Let's look in detail at some of the frequently used command-line flags and index the subsampled transcriptome."
]
},
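{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the k-mer idea concrete, the next cell sketches a toy k-mer index in Python. It is only an illustration under simplifying assumptions: the index that `salmon index` builds is the pufferfish structure, not a Python dictionary. The sequences `txA` and `txB`, the helper names `kmers` and `build_toy_index`, and the choice `k=5` are made up for this example (salmon defaults to `k=31`)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Toy illustration only: NOT how salmon stores its index internally.\n",
"from collections import defaultdict\n",
"\n",
"def kmers(seq, k):\n",
"    # Yield (offset, k-mer) for every length-k substring of seq.\n",
"    for i in range(len(seq) - k + 1):\n",
"        yield i, seq[i:i + k]\n",
"\n",
"def build_toy_index(refs, k):\n",
"    # Map each k-mer to the (reference name, offset) pairs where it occurs.\n",
"    index = defaultdict(list)\n",
"    for name, seq in refs.items():\n",
"        for pos, kmer in kmers(seq, k):\n",
"            index[kmer].append((name, pos))\n",
"    return index\n",
"\n",
"# Hypothetical reference sequences, chosen so that they share one 5-mer.\n",
"toy_refs = {'txA': 'ACGTACGA', 'txB': 'TTACGAGG'}\n",
"toy_index = build_toy_index(toy_refs, k=5)\n",
"\n",
"# A k-mer shared by both references is the kind of ambiguity an aligner must resolve.\n",
"print(toy_index['TACGA'])"
]
},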
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Version Info: This is the most recent version of salmon.\n",
"\n",
"Index\n",
"==========\n",
"Creates a salmon index.\n",
"\n",
"Command Line Options:\n",
" -v [ --version ] print version string\n",
" -h [ --help ] produce help message\n",
" -t [ --transcripts ] arg Transcript fasta file.\n",
" -k [ --kmerLen ] arg (=31) The size of k-mers that should be used for the \n",
" quasi index.\n",
" -i [ --index ] arg salmon index.\n",
" --gencode This flag will expect the input transcript \n",
" fasta to be in GENCODE format, and will split \n",
" the transcript name at the first '|' character.\n",
" These reduced names will be used in the output \n",
" and when looking for these transcripts in a \n",
" gene to transcript GTF.\n",
" --features This flag will expect the input reference to be\n",
" in the tsv file format, and will split the \n",
" feature name at the first 'tab' character. \n",
" These reduced names will be used in the output \n",
" and when looking for the sequence of the \n",
" features.GTF.\n",
" --keepDuplicates This flag will disable the default indexing \n",
" behavior of discarding sequence-identical \n",
" duplicate transcripts. If this flag is passed,\n",
" then duplicate transcripts that appear in the \n",
" input will be retained and quantified \n",
" separately.\n",
" -p [ --threads ] arg (=2) Number of threads to use during indexing.\n",
" --keepFixedFasta Retain the fixed fasta file (without short \n",
" transcripts and duplicates, clipped, etc.) \n",
" generated during indexing\n",
" -f [ --filterSize ] arg (=-1) The size of the Bloom filter that will be used \n",
" by TwoPaCo during indexing. The filter will be \n",
" of size 2^{filterSize}. The default value of -1\n",
" means that the filter size will be \n",
" automatically set based on the number of \n",
" distinct k-mers in the input, as estimated by \n",
" nthll.\n",
" --tmpdir arg The directory location that will be used for \n",
" TwoPaCo temporary files; it will be created if \n",
" need be and be removed prior to indexing \n",
" completion. The default value will cause a \n",
" (temporary) subdirectory of the salmon index \n",
" directory to be used for this purpose.\n",
" --sparse Build the index using a sparse sampling of \n",
" k-mer positions This will require less memory \n",
" (especially during quantification), but will \n",
" take longer to construct and can slow down \n",
" mapping / alignment\n",
" -d [ --decoys ] arg Treat these sequences ids from the reference as\n",
" the decoys that may have sequence homologous to\n",
" some known transcript. for example in case of \n",
" the genome, provide a list of chromosome name \n",
" --- one per line\n",
" --type arg (=puff) The type of index to build; the only option is \n",
" \"puff\" in this version of salmon.\n",
"\n"
]
}
],
"source": [
"!salmon index --help"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Papers on indexing reference sequences\n",
"* rapmap paper: https://academic.oup.com/bioinformatics/article/32/12/i192/2288985\n",
"* pufferfish paper: https://academic.oup.com/bioinformatics/article/34/13/i169/5045749\n",
"* selective-alignment paper: https://www.biorxiv.org/content/10.1101/138800v2\n",
"\n",
"### A brief introduction to k-mers and reference indexing\n",
"* slide: https://github.com/fmicompbio/adv_scrnaseq_2020/blob/master/scrna-seq-quantification/exercise.pdf"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Version Info: This is the most recent version of salmon.\n",
"index [\"data/spermatogenesis_subset/salmon_index\"] did not previously exist . . . creating it\n",
"[2020-05-27 14:36:24.134] [jLog] [warning] The salmon index is being built without any decoy sequences. It is recommended that decoy sequence (either computed auxiliary decoy sequence or the genome of the organism) be provided during indexing. Further details can be found at https://salmon.readthedocs.io/en/latest/salmon.html#preparing-transcriptome-indices-mapping-based-mode.\n",
"[2020-05-27 14:36:24.135] [jLog] [info] building index\n",
"out : data/spermatogenesis_subset/salmon_index\n",
"\u001b[00m[2020-05-27 14:36:24.135] [puff::index::jointLog] [info] Running fixFasta\n",
"\u001b[00m\n",
"[Step 1 of 4] : counting k-mers\n",
"\n",
"\u001b[33m\u001b[1m[2020-05-27 14:36:24.876] [puff::index::jointLog] [warning] Removed 16 transcripts that were sequence duplicates of indexed transcripts.\n",
"\u001b[00m\u001b[33m\u001b[1m[2020-05-27 14:36:24.876] [puff::index::jointLog] [warning] If you wish to retain duplicate transcripts, please use the `--keepDuplicates` flag\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:24.877] [puff::index::jointLog] [info] Replaced 0 non-ATCG nucleotides\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:24.877] [puff::index::jointLog] [info] Clipped poly-A tails from 67 transcripts\n",
"\u001b[00mwrote 7879 cleaned references\n",
"\u001b[00m[2020-05-27 14:36:25.036] [puff::index::jointLog] [info] Filter size not provided; estimating from number of distinct k-mers\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:25.216] [puff::index::jointLog] [info] ntHll estimated 6976516 distinct k-mers, setting filter size to 2^27\n",
"\u001b[00mThreads = 2\n",
"Vertex length = 31\n",
"Hash functions = 5\n",
"Filter size = 134217728\n",
"Capacity = 2\n",
"Files: \n",
"data/spermatogenesis_subset/salmon_index/ref_k31_fixed.fa\n",
"--------------------------------------------------------------------------------\n",
"Round 0, 0:134217728\n",
"Pass\tFilling\tFiltering\n",
"1\t2\t5\t\n",
"2\t0\t0\n",
"True junctions count = 23456\n",
"False junctions count = 42601\n",
"Hash table size = 66057\n",
"Candidate marks count = 190374\n",
"--------------------------------------------------------------------------------\n",
"Reallocating bifurcations time: 0\n",
"True marks count: 121832\n",
"Edges construction time: 1\n",
"--------------------------------------------------------------------------------\n",
"Distinct junctions = 23456\n",
"\n",
"allowedIn: 21\n",
"Max Junction ID: 30505\n",
"seen.size():244049 kmerInfo.size():30506\n",
"approximateContigTotalLength: 5297227\n",
"counters for complex kmers:\n",
"(prec>1 & succ>1)=569 | (succ>1 & isStart)=20 | (prec>1 & isEnd)=14 | (isStart & isEnd)=2\n",
"contig count: 35630 element count: 8051951 complex nodes: 605\n",
"# of ones in rank vector: 35629\n",
"\u001b[00m[2020-05-27 14:36:34.483] [puff::index::jointLog] [info] Starting the Pufferfish indexing by reading the GFA binary file.\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:34.483] [puff::index::jointLog] [info] Setting the index/BinaryGfa directory data/spermatogenesis_subset/salmon_index\n",
"\u001b[00msize = 8051951\n",
"-----------------------------------------\n",
"| Loading contigs | Time = 905.94 us\n",
"-----------------------------------------\n",
"size = 8051951\n",
"-----------------------------------------\n",
"| Loading contig boundaries | Time = 423.51 us\n",
"-----------------------------------------\n",
"Number of ones: 35636\n",
"Number of ones per inventory item: 512\n",
"Inventory entries filled: 70\n",
"35629\n",
"\u001b[00m[2020-05-27 14:36:34.506] [puff::index::jointLog] [info] Done wrapping the rank vector with a rank9sel structure.\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:34.506] [puff::index::jointLog] [info] contig count for validation: 35,629\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:34.520] [puff::index::jointLog] [info] Total # of Contigs : 35,629\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:34.520] [puff::index::jointLog] [info] Total # of numerical Contigs : 35,629\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:34.521] [puff::index::jointLog] [info] Total # of contig vec entries: 115,983\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:34.521] [puff::index::jointLog] [info] bits per offset entry 17\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:34.524] [puff::index::jointLog] [info] Done constructing the contig vector. 35630\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:34.537] [puff::index::jointLog] [info] # segments = 35,629\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:34.537] [puff::index::jointLog] [info] total length = 8,051,951\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:34.542] [puff::index::jointLog] [info] Reading the reference files ...\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:34.618] [puff::index::jointLog] [info] positional integer width = 23\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:34.618] [puff::index::jointLog] [info] seqSize = 8,051,951\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:34.618] [puff::index::jointLog] [info] rankSize = 8,051,951\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:34.618] [puff::index::jointLog] [info] edgeVecSize = 0\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:34.618] [puff::index::jointLog] [info] num keys = 6,983,081\n",
"\u001b[00mfor info, total work write each : 2.331 total work inram from level 3 : 4.322 total work raw : 25.000 \n",
"[Building BooPHF] 100 % elapsed: 0 min 1 sec remaining: 0 min 0 sec\n",
"Bitarray 36594880 bits (100.00 %) (array + ranks )\n",
"final hash 0 bits (0.00 %) (nb in final hash 0)\n",
"\u001b[00m[2020-05-27 14:36:35.274] [puff::index::jointLog] [info] mphf size = 4.36245 MB\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:35.274] [puff::index::jointLog] [info] chunk size = 4,025,976\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:35.274] [puff::index::jointLog] [info] chunk 0 = [0, 4,025,976)\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:35.274] [puff::index::jointLog] [info] chunk 1 = [4,025,976, 8,051,921)\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:36.332] [puff::index::jointLog] [info] finished populating pos vector\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:36.332] [puff::index::jointLog] [info] writing index components\n",
"\u001b[00m\u001b[00m[2020-05-27 14:36:36.413] [puff::index::jointLog] [info] finished writing dense pufferfish index\n",
"\u001b[00m[2020-05-27 14:36:36.418] [jLog] [info] done building index\n"
]
}
],
"source": [
"! salmon index -t data/spermatogenesis_subset/GRCm38.gencode.vM21.chr18.chr19.txome.fa -k 31 -i data/spermatogenesis_subset/salmon_index --gencode -p 2 "
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"complete_ref_lens.bin\tinfo.json\t rank.bin\t refseq.bin\n",
"ctable.bin\t\tmphf.bin\t refAccumLengths.bin seq.bin\n",
"ctg_offsets.bin\t\tpos.bin\t\t ref_indexing.log versionInfo.json\n",
"duplicate_clusters.tsv\tpre_indexing.log reflengths.bin\n"
]
}
],
"source": [
"! ls data/spermatogenesis_subset/salmon_index"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"RetainedRef\tDuplicateRef\n",
"ENSMUST00000198203.1\tENSMUST00000199618.1\n",
"ENSMUST00000235145.1\tENSMUST00000237994.1\n",
"ENSMUST00000236485.1\tENSMUST00000237580.1\n"
]
}
],
"source": [
"! head -4 data/spermatogenesis_subset/salmon_index/duplicate_clusters.tsv"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Understanding the Input data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Droplet-based single-cell sequencing protocols such as Drop-seq and 10x Chromium typically generate a pair of paired-end (PE) FASTQ files. Each library is built with a fixed cellular barcode (CB) and UMI length: typically 16 & 10 for 10x V2, 16 & 12 for 10x V3, and 12 & 8 for Drop-seq.  \n",
"The two files are usually recognized by the `R1` and `R2` tags in their names. The `R1` file contains the concatenated CB and UMI sequence, while the `R2` file contains the transcript read sequence."
]
},
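{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough sketch of the layout described above, the next cell splits an `R1` sequence into its CB and UMI parts. It assumes the 10x V2 geometry (16-base CB followed by a 10-base UMI); the helper name `split_barcode_read` is hypothetical and is not part of alevin. The example sequence is the first `R1` read of this dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def split_barcode_read(seq, cb_len=16, umi_len=10):\n",
"    # Assumption: the CB comes first and the UMI follows immediately (10x V2 layout).\n",
"    if len(seq) < cb_len + umi_len:\n",
"        raise ValueError('read is shorter than CB + UMI length')\n",
"    cb = seq[:cb_len]\n",
"    umi = seq[cb_len:cb_len + umi_len]\n",
"    return cb, umi\n",
"\n",
"# First R1 read of the subsampled dataset (26 bases = 16 CB + 10 UMI).\n",
"r1 = 'TTGACTTGTGAGGGAGTGCCCTGCTG'\n",
"cb, umi = split_barcode_read(r1)\n",
"print('CB :', cb)\n",
"print('UMI:', umi)"
]
},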
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"@J00167:56:HK2GNBBXX:6:1227:13352:9684 1:N:0:0\n",
"TTGACTTGTGAGGGAGTGCCCTGCTG\n",
"+\n",
"AAFFFJJJJJJJJJJJJJJJJJJJJJ\n",
"\n",
"gzip: stdout: Broken pipe\n"
]
}
],
"source": [
"!zcat data/spermatogenesis_subset/AdultMouseRep3sub1M_S1_L001_R1_001.fastq.gz | head -4 "
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"@J00167:56:HK2GNBBXX:6:1227:13352:9684 3:N:0:0\n",
"AGAAGAGCCTGGACAGATGTTATACAGACACTAAGAGAACACAAATTCCAGCCCAGGCTACTATACCCAGCCAACTCTCAATTACCATAGATGGAGAAAC\n",
"+\n",
"AAFFFJJJJJJJJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJJJJJ<JJJJJJJJJJJJJJJJJJJJJJJFJJFFJJFFJJJJJJJJJJJ\n",
"\n",
"gzip: stdout: Broken pipe\n"
]
}
],
"source": [
"!zcat data/spermatogenesis_subset/AdultMouseRep3sub1M_S1_L001_R2_001.fastq.gz | head -4 "
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"ENSMUST00000234132.1\tAC125218.3\n",
"ENSMUST00000176956.1\tVmn1r-ps151\n",
"ENSMUST00000176452.1\tVmn1r-ps152\n",
"ENSMUST00000234774.1\tAC125218.2\n"
]
}
],
"source": [
"! head -4 data/spermatogenesis_subset/GRCm38.gencode.vM21.chr18.chr19.tgMap.txt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## dscRNA-seq Quantification w/ alevin\n",
"\n",
"Now that we have a basic understanding of the inputs alevin requires to quantify dscRNA-seq data, let's take a deeper dive into some of the frequently used command-line flags (options) of the `salmon alevin` command."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Some useful links\n",
"* libtype: https://salmon.readthedocs.io/en/latest/salmon.html#what-s-this-libtype\n",
"* single-cell protocol type: https://github.com/COMBINE-lab/salmon/blob/master/include/SingleCellProtocols.hpp#L28-L84"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Version Info: This is the most recent version of salmon.\n",
"\n",
"alevin\n",
"==========\n",
"salmon-based processing of single-cell RNA-seq data.\n",
"\n",
"alevin options:\n",
"\n",
"\n",
"mapping input options:\n",
" -l [ --libType ] arg Format string describing the library \n",
" type\n",
" -i [ --index ] arg salmon index\n",
" -r [ --unmatedReads ] arg List of files containing unmated reads \n",
" of (e.g. single-end reads)\n",
" -1 [ --mates1 ] arg File containing the #1 mates\n",
" -2 [ --mates2 ] arg File containing the #2 mates\n",
"\n",
"\n",
"alevin-specific Options:\n",
" -v [ --version ] print version string\n",
" -h [ --help ] produce help message\n",
" -o [ --output ] arg Output quantification directory.\n",
" -p [ --threads ] arg (=2) The number of threads to use \n",
" concurrently.\n",
" --tgMap arg transcript to gene map tsv file\n",
" --hash arg Secondary input point for Alevin using \n",
" Big freaking Hash (bfh.txt) file. Works\n",
" Only with --chromium\n",
" --dropseq Use DropSeq Single Cell protocol for \n",
" the library\n",
" --chromiumV3 Use 10x chromium v3 Single Cell \n",
" protocol for the library.\n",
" --chromium Use 10x chromium v2 Single Cell \n",
" protocol for the library.\n",
" --gemcode Use 10x gemcode v1 Single Cell protocol\n",
" for the library.\n",
" --citeseq Use CITESeq Single Cell protocol for \n",
" the library, 16 CB, 12 UMI and \n",
" features.\n",
" --celseq Use CEL-Seq Single Cell protocol for \n",
" the library.\n",
" --celseq2 Use CEL-Seq2 Single Cell protocol for \n",
" the library.\n",
" --quartzseq2 Use Quartz-Seq2 v3.2 Single Cell \n",
" protocol for the library assumes 15 \n",
" length barcode and 8 length UMI.\n",
" --whitelist arg File containing white-list barcodes\n",
" --featureStart arg This flag should be used with citeseq \n",
" and specifies the starting index of the\n",
" feature barcode on Read2.\n",
" --featureLength arg This flag should be used with citeseq \n",
" and specifies the length of the feature\n",
" barcode.\n",
" --noQuant Don't run downstream barcode-salmon \n",
" model.\n",
" --numCellBootstraps arg (=0) Generate mean and variance for cell x \n",
" gene matrix quantification estimates.\n",
" --forceCells arg (=0) Explicitly specify the number of cells.\n",
" --expectCells arg (=0) define a close upper bound on expected \n",
" number of cells\n",
" --mrna arg path to a file containing mito-RNA \n",
" gene, one per line\n",
" --rrna arg path to a file containing ribosomal \n",
" RNA, one per line\n",
" --keepCBFraction arg (=0) fraction of CB to keep, value must be \n",
" in range (0,1], use 1 to quantify all \n",
" CB.\n",
" --end arg Cell-Barcodes end (5 or 3) location in \n",
" the read sequence from where barcode \n",
" has tobe extracted. (end, umiLength, \n",
" barcodeLength) should all be provided \n",
" if using this option\n",
" --umiLength arg umi length Parameter for unknown \n",
" protocol. (end, umiLength, \n",
" barcodeLength) should all be provided \n",
" if using this option\n",
" --barcodeLength arg umi length Parameter for unknown \n",
" protocol. (end, umiLength, \n",
" barcodeLength) should all be provided \n",
" if using this option\n",
" --noem do not run em\n",
" --freqThreshold arg (=10) threshold for the frequency of the \n",
" barcodes\n",
" --umiEditDistance arg (=1) Maximum allowble edit distance to \n",
" collapse UMIs, Expect delay in running \n",
" time if != 1\n",
" --dumpfq Dump barcode modified fastq file for \n",
" downstream analysis by using coin toss \n",
" for multi-mapping.\n",
" --dumpBfh dump the big hash with all the barcodes\n",
" and the UMI sequence.\n",
" --dumpArborescences dump the gene-v-cell matrix for the \n",
" total number of fragments used in the \n",
" UMI deduplicaiton.\n",
" --dumpUmiGraph dump the per cell level Umi Graph.\n",
" --dumpFeatures Dump features for whitelist and \n",
" downstream analysis.\n",
" --dumpMtx Dump cell v transcripts count matrix in\n",
" sparse mtx format.\n",
" --lowRegionMinNumBarcodes arg (=200) Minimum Number of CB to use for \n",
" learning Low confidence region \n",
" (Default: 200).\n",
" --maxNumBarcodes arg (=100000) Maximum allowable limit to process the \n",
" cell barcodes. (Default: 100000)\n",
"\n"
]
}
],
"source": [
"! salmon alevin --help"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Version Info: This is the most recent version of salmon.\n",
"Logs will be written to data/spermatogenesis_subset/alevin_output/logs\n",
"\u001b[00m[2020-05-27 15:04:51.366] [jointLog] [info] setting maxHashResizeThreads to 2\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:51.366] [jointLog] [info] Fragment incompatibility prior below threshold. Incompatible fragments will be ignored.\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:51.366] [jointLog] [info] The --mimicBT2, --mimicStrictBT2 and --hardFilter flags imply mapping validation (--validateMappings). Enabling mapping validation.\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:51.366] [jointLog] [info] Usage of --validateMappings implies use of minScoreFraction. Since not explicitly specified, it is being set to 0.65\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:51.366] [jointLog] [info] The use of range-factorized equivalence classes does not make sense in conjunction with --hardFilter. Disabling range-factorized equivalence classes. \n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:51.366] [jointLog] [info] Usage of --validateMappings implies a default consensus slack of 0.2. Setting consensusSlack to 0.35.\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:51.366] [jointLog] [info] Using default value of 0.87 for minScoreFraction in Alevin\n",
"Using default value of 0.6 for consensusSlack in Alevin\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:51.384] [alevinLog] [info] Found 7879 transcripts(+0 decoys, +0 short and +0 duplicate names in the index)\n",
"\u001b[00m### alevin (dscRNA-seq quantification) v1.2.1\n",
"### [ program ] => salmon \n",
"### [ command ] => alevin \n",
"### [ libType ] => { ISR }\n",
"### [ mates1 ] => { data/spermatogenesis_subset/AdultMouseRep3sub1M_S1_L001_R1_001.fastq.gz }\n",
"### [ mates2 ] => { data/spermatogenesis_subset/AdultMouseRep3sub1M_S1_L001_R2_001.fastq.gz }\n",
"### [ chromium ] => { }\n",
"### [ index ] => { data/spermatogenesis_subset/salmon_index }\n",
"### [ threads ] => { 2 }\n",
"### [ output ] => { data/spermatogenesis_subset/alevin_output }\n",
"### [ tgMap ] => { data/spermatogenesis_subset/GRCm38.gencode.vM21.chr18.chr19.tgMap.txt }\n",
"### [ expectCells ] => { 1000 }\n",
"\n",
"\n",
"\u001b[00m[2020-05-27 15:04:51.444] [alevinLog] [info] Filled with 7879 txp to gene entries \n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:51.445] [alevinLog] [info] Found all transcripts to gene mappings\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:51.561] [alevinLog] [info] Processing barcodes files (if Present) \n",
"\n",
" \n",
"\u001b[32mprocessed\u001b[31m 1 Million \u001b[32mbarcodes\u001b[0m\n",
"\n",
"\u001b[00m[2020-05-27 15:04:53.462] [alevinLog] [info] Done barcode density calculation.\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:53.462] [alevinLog] [info] # Barcodes Used: \u001b[32m1000000\u001b[0m / \u001b[31m1000000\u001b[0m.\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:53.561] [alevinLog] [info] Total \u001b[32m2103\u001b[0m(has \u001b[32m699\u001b[0m low confidence) barcodes\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:53.691] [alevinLog] [info] Done True Barcode Sampling\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:53.715] [alevinLog] [info] Total 41.4697% reads will be thrown away because of noisy Cellular barcodes.\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:53.834] [alevinLog] [info] Done populating Z matrix\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:53.834] [alevinLog] [info] Total 86 CB got sequence corrected\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:53.834] [alevinLog] [info] Done indexing Barcodes\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:53.834] [alevinLog] [info] Total Unique barcodes found: 73076\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:53.834] [alevinLog] [info] Used Barcodes except Whitelist: 86\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:53.856] [alevinLog] [info] Done with Barcode Processing; Moving to Quantify\n",
"\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:53.856] [alevinLog] [info] parsing read library format\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:53.856] [jointLog] [info] There is 1 library.\n",
"\u001b[00m-----------------------------------------\n",
"| Loading contig table | Time = 106.78 ms\n",
"-----------------------------------------\n",
"size = 35630\n",
"-----------------------------------------\n",
"| Loading contig offsets | Time = 25.398 ms\n",
"-----------------------------------------\n",
"-----------------------------------------\n",
"| Loading reference lengths | Time = 8.8671 ms\n",
"-----------------------------------------\n",
"-----------------------------------------\n",
"| Loading mphf table | Time = 119.65 ms\n",
"-----------------------------------------\n",
"\u001b[00m[2020-05-27 15:04:54.966] [jointLog] [info] Loading pufferfish index\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:54.977] [jointLog] [info] Loading dense pufferfish index.\n",
"\u001b[00msize = 8051951\n",
"Number of ones: 35636\n",
"Number of ones per inventory item: 512\n",
"Inventory entries filled: 70\n",
"-----------------------------------------\n",
"| Loading contig boundaries | Time = 54.41 ms\n",
"-----------------------------------------\n",
"size = 8051951\n",
"-----------------------------------------\n",
"| Loading sequence | Time = 66.411 ms\n",
"-----------------------------------------\n",
"size = 6983081\n",
"-----------------------------------------\n",
"| Loading positions | Time = 337.98 ms\n",
"-----------------------------------------\n",
"size = 13593212\n",
"-----------------------------------------\n",
"| Loading reference sequence | Time = 124.57 ms\n",
"-----------------------------------------\n",
"-----------------------------------------\n",
"| Loading reference accumulative lengths | Time = 19.051 ms\n",
"-----------------------------------------\n",
"\u001b[00m[2020-05-27 15:04:55.862] [jointLog] [info] done\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:55.862] [jointLog] [info] Index contained 7,879 targets\n",
"\u001b[00m\u001b[00m[2020-05-27 15:04:55.893] [jointLog] [info] Number of decoys : 0\n",
"\u001b[00m\n",
"\n",
"\n",
"\n",
"\u001b[32mprocessed\u001b[31m 0 Million \u001b[32mfragments\u001b[0m\n",
"\u001b[32mprocessed\u001b[31m 1 Million \u001b[32mfragments\u001b[0m\n",
"hits: 1724, hits per frag: 0.00173266\n",
"\n",
"\n",
"\n",
"\u001b[00m[2020-05-27 15:05:03.681] [jointLog] [info] Computed 75 rich equivalence classes for further processing\n",
"\u001b[00m\u001b[00m[2020-05-27 15:05:03.681] [jointLog] [info] Counted 551 total reads in the equivalence classes \n",
"\u001b[00m\u001b[00m[2020-05-27 15:05:03.681] [jointLog] [info] Number of fragments discarded because they are best-mapped to decoys : 0\n",
"\u001b[00m\u001b[00m[2020-05-27 15:05:03.682] [jointLog] [info] Mapping rate = 0.0551%\n",
"\n",
"\u001b[00m\u001b[00m[2020-05-27 15:05:03.682] [jointLog] [info] finished quantifyLibrary()\n",
"\u001b[00m\u001b[00m[2020-05-27 15:05:03.696] [alevinLog] [info] Starting optimizer\n",
"\n",
"\n",
"\u001b[32mAnalyzed 1 cells (\u001b[31m0%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 2 cells (\u001b[31m0%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 3 cells (\u001b[31m0%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 5 cells (\u001b[31m0%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 6 cells (\u001b[31m0%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 9 cells (\u001b[31m0%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 11 cells (\u001b[31m1%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 13 cells (\u001b[31m1%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 15 cells (\u001b[31m1%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 20 cells (\u001b[31m1%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 21 cells (\u001b[31m1%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 24 cells (\u001b[31m1%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 28 cells (\u001b[31m1%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 29 cells (\u001b[31m1%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 31 cells (\u001b[31m1%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 33 cells (\u001b[31m2%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 37 cells (\u001b[31m2%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 38 cells (\u001b[31m2%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 40 cells (\u001b[31m2%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 41 cells (\u001b[31m2%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 43 cells (\u001b[31m2%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 46 cells (\u001b[31m2%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 47 cells (\u001b[31m2%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 49 cells (\u001b[31m2%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 54 cells (\u001b[31m3%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 56 cells (\u001b[31m3%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 58 cells (\u001b[31m3%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 60 cells (\u001b[31m3%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 64 cells (\u001b[31m3%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 65 cells (\u001b[31m3%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 67 cells (\u001b[31m3%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 70 cells (\u001b[31m3%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 71 cells (\u001b[31m3%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 73 cells (\u001b[31m3%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 76 cells (\u001b[31m4%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 78 cells (\u001b[31m4%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 80 cells (\u001b[31m4%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 81 cells (\u001b[31m4%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 86 cells (\u001b[31m4%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 90 cells (\u001b[31m4%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 91 cells (\u001b[31m4%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 93 cells (\u001b[31m4%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 96 cells (\u001b[31m5%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 102 cells (\u001b[31m5%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 104 cells (\u001b[31m5%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 106 cells (\u001b[31m5%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 109 cells (\u001b[31m5%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 113 cells (\u001b[31m5%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 114 cells (\u001b[31m5%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 116 cells (\u001b[31m6%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 117 cells (\u001b[31m6%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 119 cells (\u001b[31m6%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 121 cells (\u001b[31m6%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 129 cells (\u001b[31m6%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 134 cells (\u001b[31m6%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 135 cells (\u001b[31m6%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 136 cells (\u001b[31m6%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 141 cells (\u001b[31m7%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 146 cells (\u001b[31m7%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 150 cells (\u001b[31m7%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 151 cells (\u001b[31m7%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 152 cells (\u001b[31m7%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 155 cells (\u001b[31m7%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 156 cells (\u001b[31m7%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 157 cells (\u001b[31m7%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 160 cells (\u001b[31m8%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 164 cells (\u001b[31m8%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 167 cells (\u001b[31m8%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 169 cells (\u001b[31m8%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 170 cells (\u001b[31m8%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 171 cells (\u001b[31m8%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 176 cells (\u001b[31m8%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 182 cells (\u001b[31m9%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 184 cells (\u001b[31m9%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 186 cells (\u001b[31m9%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 189 cells (\u001b[31m9%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 190 cells (\u001b[31m9%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 193 cells (\u001b[31m9%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 195 cells (\u001b[31m9%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 197 cells (\u001b[31m9%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 199 cells (\u001b[31m9%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 200 cells (\u001b[31m10%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 204 cells (\u001b[31m10%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 207 cells (\u001b[31m10%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 208 cells (\u001b[31m10%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 209 cells (\u001b[31m10%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 213 cells (\u001b[31m10%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 216 cells (\u001b[31m10%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 218 cells (\u001b[31m10%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 224 cells (\u001b[31m11%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 229 cells (\u001b[31m11%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 230 cells (\u001b[31m11%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 232 cells (\u001b[31m11%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 238 cells (\u001b[31m11%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 239 cells (\u001b[31m11%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 241 cells (\u001b[31m11%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 242 cells (\u001b[31m12%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 248 cells (\u001b[31m12%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 249 cells (\u001b[31m12%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 250 cells (\u001b[31m12%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 254 cells (\u001b[31m12%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 264 cells (\u001b[31m13%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 266 cells (\u001b[31m13%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 272 cells (\u001b[31m13%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 280 cells (\u001b[31m13%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 300 cells (\u001b[31m14%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 301 cells (\u001b[31m14%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 304 cells (\u001b[31m14%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 305 cells (\u001b[31m15%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 306 cells (\u001b[31m15%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 307 cells (\u001b[31m15%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 308 cells (\u001b[31m15%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 310 cells (\u001b[31m15%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 314 cells (\u001b[31m15%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 317 cells (\u001b[31m15%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 319 cells (\u001b[31m15%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 320 cells (\u001b[31m15%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 324 cells (\u001b[31m15%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 327 cells (\u001b[31m16%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 328 cells (\u001b[31m16%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 332 cells (\u001b[31m16%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 333 cells (\u001b[31m16%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 338 cells (\u001b[31m16%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 340 cells (\u001b[31m16%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 343 cells (\u001b[31m16%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 347 cells (\u001b[31m17%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 349 cells (\u001b[31m17%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 350 cells (\u001b[31m17%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 359 cells (\u001b[31m17%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 363 cells (\u001b[31m17%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 367 cells (\u001b[31m17%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 373 cells (\u001b[31m18%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 374 cells (\u001b[31m18%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 376 cells (\u001b[31m18%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 380 cells (\u001b[31m18%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 383 cells (\u001b[31m18%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 393 cells (\u001b[31m19%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 395 cells (\u001b[31m19%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 398 cells (\u001b[31m19%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 400 cells (\u001b[31m19%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 401 cells (\u001b[31m19%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 405 cells (\u001b[31m19%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 406 cells (\u001b[31m19%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 407 cells (\u001b[31m19%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 420 cells (\u001b[31m20%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 421 cells (\u001b[31m20%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 427 cells (\u001b[31m20%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 430 cells (\u001b[31m20%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 432 cells (\u001b[31m21%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 435 cells (\u001b[31m21%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 441 cells (\u001b[31m21%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 443 cells (\u001b[31m21%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 444 cells (\u001b[31m21%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 445 cells (\u001b[31m21%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 450 cells (\u001b[31m21%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 451 cells (\u001b[31m21%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 467 cells (\u001b[31m22%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 470 cells (\u001b[31m22%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 472 cells (\u001b[31m22%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 473 cells (\u001b[31m23%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 480 cells (\u001b[31m23%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 481 cells (\u001b[31m23%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 483 cells (\u001b[31m23%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 484 cells (\u001b[31m23%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 492 cells (\u001b[31m23%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 494 cells (\u001b[31m24%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 505 cells (\u001b[31m24%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 506 cells (\u001b[31m24%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 512 cells (\u001b[31m24%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 515 cells (\u001b[31m25%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 523 cells (\u001b[31m25%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 525 cells (\u001b[31m25%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 531 cells (\u001b[31m25%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 534 cells (\u001b[31m25%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 542 cells (\u001b[31m26%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 545 cells (\u001b[31m26%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 547 cells (\u001b[31m26%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 552 cells (\u001b[31m26%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 560 cells (\u001b[31m27%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 563 cells (\u001b[31m27%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 566 cells (\u001b[31m27%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 569 cells (\u001b[31m27%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 573 cells (\u001b[31m27%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 575 cells (\u001b[31m27%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 586 cells (\u001b[31m28%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 591 cells (\u001b[31m28%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 597 cells (\u001b[31m28%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 598 cells (\u001b[31m28%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 601 cells (\u001b[31m29%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 603 cells (\u001b[31m29%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 608 cells (\u001b[31m29%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 609 cells (\u001b[31m29%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 610 cells (\u001b[31m29%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 612 cells (\u001b[31m29%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 613 cells (\u001b[31m29%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 614 cells (\u001b[31m29%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 615 cells (\u001b[31m29%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 616 cells (\u001b[31m29%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 618 cells (\u001b[31m29%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 633 cells (\u001b[31m30%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 659 cells (\u001b[31m31%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 660 cells (\u001b[31m31%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 662 cells (\u001b[31m31%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 663 cells (\u001b[31m32%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 664 cells (\u001b[31m32%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 668 cells (\u001b[31m32%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 671 cells (\u001b[31m32%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 673 cells (\u001b[31m32%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 674 cells (\u001b[31m32%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 695 cells (\u001b[31m33%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 702 cells (\u001b[31m33%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 704 cells (\u001b[31m33%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 710 cells (\u001b[31m34%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 711 cells (\u001b[31m34%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 719 cells (\u001b[31m34%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 722 cells (\u001b[31m34%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 730 cells (\u001b[31m35%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 734 cells (\u001b[31m35%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 739 cells (\u001b[31m35%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 741 cells (\u001b[31m35%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 743 cells (\u001b[31m35%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 748 cells (\u001b[31m36%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 752 cells (\u001b[31m36%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 756 cells (\u001b[31m36%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 758 cells (\u001b[31m36%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 759 cells (\u001b[31m36%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 760 cells (\u001b[31m36%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 771 cells (\u001b[31m37%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 780 cells (\u001b[31m37%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 781 cells (\u001b[31m37%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 784 cells (\u001b[31m37%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 787 cells (\u001b[31m37%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 797 cells (\u001b[31m38%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 798 cells (\u001b[31m38%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 807 cells (\u001b[31m38%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 825 cells (\u001b[31m39%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 833 cells (\u001b[31m40%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 835 cells (\u001b[31m40%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 836 cells (\u001b[31m40%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 837 cells (\u001b[31m40%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 840 cells (\u001b[31m40%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 842 cells (\u001b[31m40%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 845 cells (\u001b[31m40%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 849 cells (\u001b[31m40%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 850 cells (\u001b[31m40%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 858 cells (\u001b[31m41%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 868 cells (\u001b[31m41%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 869 cells (\u001b[31m41%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 871 cells (\u001b[31m41%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 881 cells (\u001b[31m42%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 884 cells (\u001b[31m42%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 890 cells (\u001b[31m42%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 893 cells (\u001b[31m42%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 903 cells (\u001b[31m43%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 905 cells (\u001b[31m43%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 909 cells (\u001b[31m43%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 914 cells (\u001b[31m43%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 915 cells (\u001b[31m44%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 917 cells (\u001b[31m44%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 921 cells (\u001b[31m44%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 941 cells (\u001b[31m45%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 953 cells (\u001b[31m45%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 969 cells (\u001b[31m46%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 975 cells (\u001b[31m46%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 991 cells (\u001b[31m47%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 992 cells (\u001b[31m47%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1001 cells (\u001b[31m48%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1004 cells (\u001b[31m48%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1009 cells (\u001b[31m48%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1015 cells (\u001b[31m48%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1019 cells (\u001b[31m48%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1038 cells (\u001b[31m49%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1043 cells (\u001b[31m50%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1050 cells (\u001b[31m50%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1084 cells (\u001b[31m52%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1090 cells (\u001b[31m52%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1097 cells (\u001b[31m52%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1106 cells (\u001b[31m53%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1111 cells (\u001b[31m53%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1133 cells (\u001b[31m54%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1135 cells (\u001b[31m54%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1136 cells (\u001b[31m54%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1146 cells (\u001b[31m55%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1162 cells (\u001b[31m55%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1208 cells (\u001b[31m57%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1210 cells (\u001b[31m58%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1215 cells (\u001b[31m58%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1224 cells (\u001b[31m58%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1228 cells (\u001b[31m58%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1249 cells (\u001b[31m59%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1276 cells (\u001b[31m61%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1282 cells (\u001b[31m61%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1315 cells (\u001b[31m63%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1391 cells (\u001b[31m66%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1404 cells (\u001b[31m67%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1417 cells (\u001b[31m67%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1418 cells (\u001b[31m67%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1427 cells (\u001b[31m68%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1428 cells (\u001b[31m68%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1430 cells (\u001b[31m68%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1442 cells (\u001b[31m69%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1447 cells (\u001b[31m69%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1472 cells (\u001b[31m70%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1474 cells (\u001b[31m70%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1475 cells (\u001b[31m70%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1482 cells (\u001b[31m71%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1485 cells (\u001b[31m71%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1502 cells (\u001b[31m71%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1511 cells (\u001b[31m72%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1519 cells (\u001b[31m72%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1520 cells (\u001b[31m72%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1522 cells (\u001b[31m72%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1524 cells (\u001b[31m73%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1546 cells (\u001b[31m74%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1550 cells (\u001b[31m74%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1561 cells (\u001b[31m74%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1577 cells (\u001b[31m75%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1584 cells (\u001b[31m75%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1613 cells (\u001b[31m77%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1633 cells (\u001b[31m78%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1666 cells (\u001b[31m79%\u001b[32m of all).\u001b[0m\n",
"\u001b[32mAnalyzed 1711 cells (\u001b[31m81%\u001b[32m of all).\u001b[0m\n",