
Update readme.md

Mike Mikailov, 5 years ago
parent commit 1b8619d5c6
1 changed file with 7 additions and 144 deletions

readme.md

@@ -1,6 +1,7 @@
 [1 Image patch extraction](#1-image-patch-extraction)   
 [2 Prediction](#2-prediction)  
 [3 Heatmap stitching](#3-heatmap-stitching)
+[4 Retrieving run-time statistics](#4-retrieving-run-time-statistics)
 
 # 1 Image patch extraction
 The following commands launch Son of Grid Engine (SGE) jobs to extract patches, group them into HDF5 files, and create a lookup table for every HDF5 file. 
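The extraction commands themselves fall outside this hunk; as a hedged illustration only, here is a minimal sketch of reading one grouped patch back out of an HDF5 file via its lookup table. The file names, dataset name, and lookup format are hypothetical, not the repository's actual layout.

    # Hypothetical layout: patches_grp.h5 holds stacked patches, and a
    # plain-text lookup table records where each patch came from.
    import h5py

    with open("patches_grp_lookup.txt") as f:       # hypothetical lookup file
        lookup = [line.split() for line in f]       # e.g. slide_id col row

    with h5py.File("patches_grp.h5", "r") as f:     # hypothetical HDF5 file
        patch = f["patches"][42]                    # 42nd extracted patch
    slide_id, col, row = lookup[42]                 # its slide and grid position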
@@ -34,152 +35,14 @@ qsub process_main.sh ./config_normal.txt
 qsub process_main.sh ./config_tumor.txt  
 
 # 3 Heatmap stitching
-After the prediction matrices have been generated, the following SGE job could be launched to generate heatmaps.
+After the prediction matrices have been generated, an SGE job using the heatmap_main.sh script can be launched to generate heatmaps. The launch takes two arguments: a) the type of the slides (test, normal, or tumor); b) the root directory of the results, as in the example run below:  
 qsub heatmap_main.sh test /scratch/mikem/UserSupport/weizhe.li/runs_process_cn_True/testing_wnorm_448_400_7690953
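
A hedged sketch of what the stitching step does: place per-patch probabilities onto a grid, one cell per 224-pixel patch. The prediction-file name and format are assumptions, not the actual output of heatmap_main.sh.

    import numpy as np

    # Assumed format: one row per patch, columns (col, row, probability).
    preds = np.load("predictions_test_001.npy")     # hypothetical file

    heatmap = np.zeros((int(preds[:, 1].max()) + 1,
                        int(preds[:, 0].max()) + 1), dtype=np.float32)
    for col, row, prob in preds:
        heatmap[int(row), int(col)] = prob          # one cell per 224 px patch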
 
+# 4 Retrieving run-time statistics
+In the time_all_stats_pred.sh file, adjust the job results root directory, for example:  
+DIR=/scratch/mikem/UserSupport/weizhe.li/runs_process_cn_False/normal_wnorm_448_400_7691563  
+Then run:
+time bash ./time_all_stats_pred.sh  
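
The contents of time_all_stats_pred.sh are not shown here; as a hedged sketch of the kind of aggregation such a script performs, the following mirrors the find/grep/sort pipelines used later in this file. The sysout subdirectory and the "seconds" log format are assumptions.

    import glob
    import os
    import re

    DIR = "/scratch/mikem/UserSupport/weizhe.li/runs_process_cn_False/normal_wnorm_448_400_7691563"

    # Collect "<n> seconds" figures from every SGE output file under DIR.
    times = []
    for path in glob.glob(os.path.join(DIR, "sysout", "*")):
        if not os.path.isfile(path):
            continue
        with open(path, errors="ignore") as f:
            for line in f:
                m = re.search(r"([\d.]+)\s*seconds", line)
                if m:
                    times.append((float(m.group(1)), os.path.basename(path)))

    # Sort by run time, like the `sort -n` steps elsewhere in this readme.
    for seconds, name in sorted(times):
        print(f"{seconds:10.2f}  {name}")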
 
 
-# TESTING
-[mikem@betsy02 split_wsi]$ ls -alsh /scratch/wxc4/CAMELYON16-testing | less
-total 206G
-[mikem@betsy02 split_wsi]$ ls -1 /scratch/wxc4/CAMELYON16-testing | wc -l
-129
 
-# TUMOR
-[mikem@betsy02 split_wsi]$ ls -alsh /scratch/wxc4/CAMELYON16-training/tumor | less
-total 219G
-[mikem@betsy02 split_wsi]$ ls -1 /scratch/wxc4/CAMELYON16-training/tumor | wc -l
-111
-
-# NORMAL
-[mikem@betsy02 split_wsi]$ ls -alsh /scratch/wxc4/CAMELYON16-training/normal | less
-total 278G
-[mikem@betsy02 split_wsi]$ ls -1 /scratch/wxc4/CAMELYON16-training/normal | wc -l
-159
-
-
-Location of the bounding box:
-
-    dimensions = {'normal' : '/home/weizhe.li/li-code4hpc/pred_dim_0314/training-updated/normal/dimensions',
-                  'tumor' : '/home/weizhe.li/li-code4hpc/pred_dim_0314/training-updated/tumor/dimensions',
-                  'test' : '/home/weizhe.li/li-code4hpc/pred_dim_0314/testing/dimensions'  
-        }
-slide.dimensions
-A (width, height) tuple for level 0 of the slide.
-
-get_tile_dimensions(level, address)
-    Return a (pixels_x, pixels_y) tuple for the specified tile.
-
-get_tile_coordinates(level, address)
-    Return the OpenSlide.read_region() arguments corresponding to the specified tile.
-
-read_region(location, level, size)
-    Return an RGBA Image containing the contents of the specified region.
-    • location (tuple) – (x, y) tuple giving the top left pixel in the level 0 reference frame
-    • level (int) – the level number
-    • size (tuple) – (width, height) tuple giving the region size
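
The tuple returned by get_tile_coordinates can be passed straight to read_region. A minimal sketch, assuming an OpenSlide-readable slide; the filename and tile address are hypothetical. The interactive session below shows the shape of one such tuple.

    import openslide
    from openslide.deepzoom import DeepZoomGenerator

    slide = openslide.OpenSlide("tumor_001.tif")     # hypothetical filename
    dz = DeepZoomGenerator(slide, tile_size=224, overlap=0)

    level = dz.level_count - 1                       # highest-resolution level
    crds = dz.get_tile_coordinates(level, (0, 0))    # ((x, y), level, (w, h))
    region = slide.read_region(*crds)                # RGBA PIL image of the tile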
-
-
->>> crds
-((96, 0), 0, (512, 288))
->>> crds[0]
-(96, 0)
->>> crds[0][0]
-96
->>> crds[2]
-(512, 288)
->>> crds[2][0]
-512
->>> crds[2][1]
-288
->>> crds[1]
-0
-
-The dimension numbers multiplied by 224 give the actual dimensions of the highest-resolution image.
-
-The dimensions from openslide form a coordinate pair (x, y) where x is the width and y is the height.
-If the image is read into a numpy array, the array's shape reports the reverse order: height first, then width. Please note that the sequence is reversed. Please use the description of the dimension file in this email.
-By the way, openslide uses the top-left corner as the (0, 0) point; the x axis points right and the y axis points down.
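
A short sketch of this reversal, assuming any OpenSlide-readable file (the filename is hypothetical):

    import numpy as np
    import openslide

    slide = openslide.OpenSlide("normal_001.tif")    # hypothetical filename
    print(slide.dimensions)                          # (width, height) at level 0

    lowest = slide.level_count - 1
    img = slide.read_region((0, 0), lowest, slide.level_dimensions[lowest])
    print(np.asarray(img).shape)                     # (height, width, 4): reversed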
-The dimension files I mentioned in my last email have the following contents:
-
-Each file stores a list: [item1, item2, item3, item4, item5, item6, item7, item8, item9]  (sample values: 4256 5152 4256 5152)
-
-item1: height of the WSI image
-item2: width of the WSI image
-item3: number of channels (equal to 3; we don't need this item)
-item4: x coordinate of the left edge of the bounding box
-item5: x coordinate of the right edge of the bounding box
-item6: y coordinate of the top edge of the bounding box
-item7: y coordinate of the bottom edge of the bounding box
-item8: height of the bounding box
-item9: width of the bounding box
-
-
-=====
-
-Each file stores a bounding-box list. The actual coordinates at the highest resolution need to be multiplied by 224. I will send you a description tomorrow morning.
-
-The dimension files I mentioned in my last email have the following contents:
-
-Each file stores a list: [item1, item2, item3, item4, item5, item6, item7, item8, item9]
-
-item1: width of the WSI image
-item2: height of the WSI image
-item3: number of channels (equal to 3; we don't need this item)
-item4: x coordinate of the left edge of the bounding box
-item5: x coordinate of the right edge of the bounding box
-item6: y coordinate of the top edge of the bounding box
-item7: y coordinate of the bottom edge of the bounding box
-item8: width of the bounding box
-item9: height of the bounding box
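
A hedged sketch of reading one dimension file into named fields, following this second description (the two descriptions above disagree on width/height order, so verify against real files). The pickle storage format is an assumption:

    import pickle

    def read_dims(path):
        with open(path, "rb") as f:
            (wsi_w, wsi_h, _channels,
             bb_left, bb_right, bb_top, bb_bottom,
             bb_w, bb_h) = pickle.load(f)
        scale = 224   # grid units -> level-0 pixels, per the note above
        return {
            "wsi_size": (wsi_w * scale, wsi_h * scale),
            "bbox": (bb_left * scale, bb_top * scale,
                     bb_right * scale, bb_bottom * scale),
        }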
-
-
-
-
-Level counting
-qsub get_levels.sh
-cat level_count/* >> all_level_count.csv
-sort -k1n all_level_count.csv > all_level_count-sorted.csv
-
-
-1. Create a list of WSIs located under: /scratch/wxc4/CAMELYON16-testing/
-   cd /projects/mikem/UserSupport/weizhe.li/split_wsi
-   ls -1 /scratch/wxc4/CAMELYON16-testing > list.txt
-
-2. Run the job below to split TIF files into smaller HDF5 files:
-   qsub split_grp.sh 
-
-3. Generate look-up tables:
-   bash create_lookup_grp.sh 
-   
-4. Run the script below to process all HDF5 files in parallel:
-   qsub process_main.sh
-
-process_images_grp.o7677468
-
-At [mikem@betsy02 split_wsi]$:
-RESULT=grp_timing1
-BASEDIR=/projects/mikem/UserSupport/weizhe.li/split_wsi/sysout_process_images_grp
-APP_PREFIX=process_images_grp.o7678134
-find $BASEDIR -name "$APP_PREFIX.*" | xargs grep "seconds" > "$RESULT".txt
-sort -k2 -n "$RESULT".txt > "$RESULT"_sorted.txt 
-
-
-find /projects/mikem/UserSupport/weizhe.li/split_wsi/sysout_process_images_ds -name "process_images_ds.o7676383.*" | xargs grep "seconds" > ds_timing1.txt
-sort -k2 -n ds_timing1.txt > ds_timing1_sorted.txt 
-
-find /projects/mikem/UserSupport/weizhe.li/split_wsi/sysout_process_images_ds -name "process_images_ds.o7674135.*" | xargs grep "seconds" > ds_timing.txt
-sort -k2 -n ds_timing.txt > ds_timing_sorted.txt 
-find /projects/mikem/UserSupport/weizhe.li/split_wsi/sysout_process_images -name "process_images.o7673281.*" | xargs grep "seconds" > file_timing.txt
-sort -k2 -n file_timing.txt > file_timing_sorted.txt 
-
-create_dataset, see: http://docs.h5py.org/en/stable/high/group.html
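
For reference, a minimal create_dataset sketch (the file name, dataset name, and patch shape are hypothetical):

    import h5py
    import numpy as np

    patches = np.zeros((400, 224, 224, 3), dtype=np.uint8)   # 400 RGB patches

    with h5py.File("patches_grp.h5", "w") as f:
        f.create_dataset("patches", data=patches,
                         chunks=(1, 224, 224, 3), compression="gzip")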
- 
-
-# for splitting and grouping 
-D=/scratch/mikem/UserSupport/weizhe.li/runs_split_group/448_400_7684656/sysout
-find $D -name "split*" | xargs grep "real" > split_timing.txt