1. Load the TrendCatcher R package.

library(TrendCatcher)

2. Read in the demo count table.

TrendCatcher requires the input count table to be a CSV file whose column names follow the format “ProjectName_Time_Rep1”. The first column must contain the gene symbol or Ensembl gene ID.
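Before running the pipeline it can help to check column names against this format. The sketch below is a minimal illustration in base R; the helper `is.valid.colname` and its regular expression are assumptions based on the format described above, not part of the TrendCatcher API.

```r
# Hypothetical helper: TRUE when a name looks like "ProjectName_Time_Rep<N>".
# The pattern is one reading of the documented format, not the package's own check.
is.valid.colname <- function(x) {
  grepl("^[^_]+_[0-9]+(\\.[0-9]+)?_Rep[0-9]+$", x)
}

# Three well-formed names followed by two malformed ones.
cols <- c("B_0_Rep1", "B_24_Rep3", "B_168_Rep4", "B-24-Rep1", "B_24h_Rep1")
valid <- is.valid.colname(cols)
print(data.frame(column = cols, valid = valid))
```

With a real table, `cols` would be `colnames(tb)` after reading the CSV with `row.names = 1`.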

example.file.path<-system.file("extdata", "Brain_DemoCountTable.csv", package = "TrendCatcher")
tb<-read.csv(example.file.path, row.names = 1)
head(tb)
##                    B_0_Rep1 B_0_Rep2 B_0_Rep3 B_0_Rep4 B_0_Rep5 B_6_Rep1
## ENSMUSG00000000001      118      104      115      119       98      127
## ENSMUSG00000000028        2        3        3        2        3        3
## ENSMUSG00000000031        2        1        2        2        2        2
## ENSMUSG00000000037        1        1        1        0        2        1
## ENSMUSG00000000056       57       48       41       51       44       45
## ENSMUSG00000000058      215      315      226      314      220       49
##                    B_6_Rep2 B_6_Rep3 B_6_Rep4 B_6_Rep5 B_24_Rep1 B_24_Rep2
## ENSMUSG00000000001      163      140      167       68       178       182
## ENSMUSG00000000028        2        2        4        1         4         3
## ENSMUSG00000000031        1        2        5        1        10         5
## ENSMUSG00000000037        1        0        1        1         0         0
## ENSMUSG00000000056       37       44       52       56        30        38
## ENSMUSG00000000058       68       58       81      246        30       106
##                    B_24_Rep3 B_24_Rep4 B_48_Rep1 B_48_Rep2 B_48_Rep3 B_48_Rep4
## ENSMUSG00000000001       162       200       121       115       405       136
## ENSMUSG00000000028         4         3        11         7         5        15
## ENSMUSG00000000031         3         1         2         2         1         5
## ENSMUSG00000000037         0         1         1         1         1         1
## ENSMUSG00000000056        27        23        33        42        16        44
## ENSMUSG00000000058        58        75       204       187        92       139
##                    B_48_Rep5 B_48_Rep6 B_72_Rep1 B_72_Rep2 B_72_Rep3 B_72_Rep4
## ENSMUSG00000000001       193       201       127       151       118       125
## ENSMUSG00000000028         8         4        10        10        24        20
## ENSMUSG00000000031         1         2         2         2         2         9
## ENSMUSG00000000037         1         1         1         1         1         1
## ENSMUSG00000000056        28        32        48        46        47        47
## ENSMUSG00000000058        78       127       251       278       211       224
##                    B_168_Rep1 B_168_Rep2 B_168_Rep3 B_168_Rep4
## ENSMUSG00000000001         67        105         86         98
## ENSMUSG00000000028          2          2          2          3
## ENSMUSG00000000031          1          1          2          4
## ENSMUSG00000000037          1          1          1          1
## ENSMUSG00000000056         47         64         78         81
## ENSMUSG00000000058        253        159        311        354
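The column names also encode the experimental design. As a rough sketch in base R, the snippet below rebuilds the demo column names shown in the output above and counts the replicates per time point; the variable names are illustrative only.

```r
# Column names as they appear in the demo table printed above.
cols <- c(paste0("B_0_Rep", 1:5), paste0("B_6_Rep", 1:5),
          paste0("B_24_Rep", 1:4), paste0("B_48_Rep", 1:6),
          paste0("B_72_Rep", 1:4), paste0("B_168_Rep", 1:4))

# Split "Project_Time_Rep" into a three-column character matrix.
parts <- do.call(rbind, strsplit(cols, "_", fixed = TRUE))
project <- unique(parts[, 1])
time <- as.numeric(parts[, 2])

# Number of replicates at each time point, ordered by time.
rep.per.time <- table(time)
print(project)
print(rep.per.time)
```

With the real data, replacing the first assignment by `cols <- colnames(tb)` should give the same summary.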

3. Run TrendCatcher and generate the master.list object.

This function takes a few minutes to finish, even when running on multiple cores.

example.file.path<-system.file("extdata", "Brain_DemoCountTable.csv", package = "TrendCatcher")

master.list<-run_TrendCatcher(count.table.path = example.file.path,
                              baseline.t = 0,
                              time.unit = "h",
                              min.low.count = 1,
                              para.core.n = NA,
                              dyn.p.thres = 0.05)

4. Check master.list in detail.

To save running time, the output master.list object is already included in the ‘/inst/extdata’ folder. You can simply load it into your environment.

demo.master.list.path<-system.file("extdata", "BrainMasterList.rda", package = "TrendCatcher")
load(demo.master.list.path)
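Note that `load()` restores objects under the names they were saved with, so it will overwrite any existing variable of the same name. The sketch below illustrates this with a small stand-in list and a temporary file; the stand-in contents are made up for illustration and are not the real master.list.

```r
# Stand-in object; the real master.list comes from run_TrendCatcher().
master.list <- list(time.unit = "h", baseline.t = 0)

# Save it to a temporary .rda file, then remove it from the environment.
tmp.path <- tempfile(fileext = ".rda")
save(master.list, file = tmp.path)
rm(master.list)

# load() invisibly returns the names of the objects it restored.
loaded.names <- load(tmp.path)
print(loaded.names)
print(exists("master.list"))
```

Capturing the return value of `load()` is a simple way to confirm which objects a file has added to the environment.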

First, check which elements the master.list object contains.

names(master.list)
## [1] "time.unit"    "baseline.t"   "t.arr"        "Project.name" "raw.df"      
## [6] "fitted.count" "master.table"
  • “time.unit” is the time unit; for example, “h” represents hours.
  • “baseline.t” is the baseline time. Here it is hour 0.
  • “t.arr” is the array of time points in the time-course study. Here it is 0, 6, 24, 48, 72 and 168 h.
  • “Project.name” is the project name. Here it is “B”.
  • “raw.df” is the count table ordered by time and replicate ID.
  • “fitted.count” is the table of count values fitted by the ANOVA smooth model across replicates.
  • “master.table” is the per-gene table of trajectory patterns; its columns are described further below.
print(c(master.list$time.unit, master.list$baseline.t))
## [1] "h" "0"
master.list$t.arr
## [1]   0   6  24  48  72 168
master.list$Project.name
## [1] "B"
head(master.list$raw.df)
##                    B_0_Rep1 B_0_Rep2 B_0_Rep3 B_0_Rep4 B_0_Rep5 B_6_Rep1
## ENSMUSG00000000001      118      104      115      119       98      127
## ENSMUSG00000000028        2        3        3        2        3        3
## ENSMUSG00000000031        2        1        2        2        2        2
## ENSMUSG00000000037        1        1        1        0        2        1
## ENSMUSG00000000056       57       48       41       51       44       45
## ENSMUSG00000000058      215      315      226      314      220       49
##                    B_6_Rep2 B_6_Rep3 B_6_Rep4 B_6_Rep5 B_24_Rep1 B_24_Rep2
## ENSMUSG00000000001      163      140      167       68       178       182
## ENSMUSG00000000028        2        2        4        1         4         3
## ENSMUSG00000000031        1        2        5        1        10         5
## ENSMUSG00000000037        1        0        1        1         0         0
## ENSMUSG00000000056       37       44       52       56        30        38
## ENSMUSG00000000058       68       58       81      246        30       106
##                    B_24_Rep3 B_24_Rep4 B_48_Rep1 B_48_Rep2 B_48_Rep3 B_48_Rep4
## ENSMUSG00000000001       162       200       121       115       405       136
## ENSMUSG00000000028         4         3        11         7         5        15
## ENSMUSG00000000031         3         1         2         2         1         5
## ENSMUSG00000000037         0         1         1         1         1         1
## ENSMUSG00000000056        27        23        33        42        16        44
## ENSMUSG00000000058        58        75       204       187        92       139
##                    B_48_Rep5 B_48_Rep6 B_72_Rep1 B_72_Rep2 B_72_Rep3 B_72_Rep4
## ENSMUSG00000000001       193       201       127       151       118       125
## ENSMUSG00000000028         8         4        10        10        24        20
## ENSMUSG00000000031         1         2         2         2         2         9
## ENSMUSG00000000037         1         1         1         1         1         1
## ENSMUSG00000000056        28        32        48        46        47        47
## ENSMUSG00000000058        78       127       251       278       211       224
##                    B_168_Rep1 B_168_Rep2 B_168_Rep3 B_168_Rep4
## ENSMUSG00000000001         67        105         86         98
## ENSMUSG00000000028          2          2          2          3
## ENSMUSG00000000031          1          1          2          4
## ENSMUSG00000000037          1          1          1          1
## ENSMUSG00000000056         47         64         78         81
## ENSMUSG00000000058        253        159        311        354

Each column of the fitted.count table represents:

  • Gene, the Ensembl gene ID or gene symbol from the raw data.
  • Time, the time point measured.
  • Fit.Count, the count value fitted by the ANOVA smooth model.
  • mu, the estimated mean count of the baseline expression.
  • disp, the estimated dispersion of the baseline expression.
  • t.p.val, the dynamic p-value at each time point, compared to the baseline negative binomial (NB) distribution.
  • dyn.p.val, the combined dynamic p-value from all time points.
  • dyn.p.val.adj, the adjusted dynamic p-value across all dynamic differentially expressed genes (DDEGs).
head(master.list$fitted.count)
##                 Gene Time Fit.Count    mu     disp    t.p.val dyn.p.val
## 1 ENSMUSG00000000001    0 110.80002 110.8 12.88174 1.00000000  8.67e-03
## 2 ENSMUSG00000000001    6 137.81128 110.8 12.88174 0.19627656  8.67e-03
## 3 ENSMUSG00000000001   24 175.52409 110.8 12.88174 0.03646947  8.67e-03
## 4 ENSMUSG00000000001   48 185.67185 110.8 12.88174 0.02176459  8.67e-03
## 5 ENSMUSG00000000001   72 139.55592 110.8 12.88174 0.18189626  8.67e-03
## 6 ENSMUSG00000000001  168  88.65132 110.8 12.88174 0.26173223  8.67e-03
##   dyn.p.val.adj
## 1    0.06409249
## 2    0.06409249
## 3    0.06409249
## 4    0.06409249
## 5    0.06409249
## 6    0.06409249
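The dyn.p.val printed above is numerically consistent with Fisher's method applied to the t.p.val values at the five non-baseline time points. The sketch below works under that assumption; it is an illustration of how the per-time-point values can combine, not a description of the package internals, and the variable names are illustrative.

```r
# t.p.val for ENSMUSG00000000001 at 6, 24, 48, 72 and 168 h, copied from the
# output above (the baseline row, with t.p.val = 1, is left out).
t.p.val <- c(0.19627656, 0.03646947, 0.02176459, 0.18189626, 0.26173223)

# Fisher's method (assumed here): under the null hypothesis,
# -2 * sum(log(p)) follows a chi-squared distribution with 2k degrees
# of freedom, where k is the number of p-values combined.
fisher.stat <- -2 * sum(log(t.p.val))
combined.p <- pchisq(fisher.stat, df = 2 * length(t.p.val), lower.tail = FALSE)

print(combined.p)  # approximately 0.00867
```

The result agrees with the 8.67e-03 shown in the dyn.p.val column for this gene.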

Each column of the master table represents:

  • Gene, the Ensembl gene ID or gene symbol.
  • pattern, the gene trajectory master-pattern type.
  • start.idx, the index in t.arr where each trend starts.
  • end.idx, the index in t.arr where each trend ends.
  • dynTime, the time points where t.p.val <= 0.05 (the break points).
  • dynSign, the direction at each dynamic time point: “-” means down, “+” means up.
  • start.t, the time at which each trend starts.
  • end.t, the time at which each trend ends.
  • pattern_str, a printable string describing the gene trajectory sub-pattern type.
  • dyn.p.val and dyn.p.val.adj, the combined and adjusted dynamic p-values, as in the fitted.count table.
head(master.list$master.table)
##                 Gene  pattern start.idx end.idx     dynTime  dynSign start.t
## 1 ENSMUSG00000025283 up_down_      1_4_    4_6_ 6_24_48_72_ +_+_+_+_   0_48_
## 2 ENSMUSG00000028967 up_down_      1_2_    2_6_    6_24_48_   +_+_+_    0_6_
## 3 ENSMUSG00000039236 up_down_      1_2_    2_6_       6_24_     +_+_    0_6_
## 4 ENSMUSG00000078920 up_down_      1_2_    2_6_    6_24_48_   +_+_+_    0_6_
## 5 ENSMUSG00000105987 up_down_      1_4_    4_6_ 6_24_48_72_ +_+_+_+_   0_48_
## 6 ENSMUSG00000022221 up_down_      1_3_    3_6_ 6_24_48_72_ +_+_+_+_   0_24_
##     end.t         pattern_str dyn.p.val dyn.p.val.adj
## 1 48_168_ 0h_up_48h_down_168h  1.11e-16  4.615657e-13
## 2  6_168_  0h_up_6h_down_168h  1.11e-16  4.615657e-13
## 3  6_168_  0h_up_6h_down_168h  1.11e-16  4.615657e-13
## 4  6_168_  0h_up_6h_down_168h  1.11e-16  4.615657e-13
## 5 48_168_ 0h_up_48h_down_168h  2.22e-16  7.385052e-13
## 6 24_168_ 0h_up_24h_down_168h  3.33e-16  9.231315e-13
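Most master-table fields are underscore-delimited strings with one entry per trend segment or per dynamic time point. The sketch below decodes the first row shown above using base R; the helper `parse.field` and the variable names are hypothetical and not part of the package.

```r
# Time array from master.list$t.arr.
t.arr <- c(0, 6, 24, 48, 72, 168)

# First row of the master table printed above.
first.row <- list(pattern = "up_down_", start.idx = "1_4_", end.idx = "4_6_",
                  dynTime = "6_24_48_72_", dynSign = "+_+_+_+_")

# Hypothetical helper: split an underscore-delimited field into its parts.
# strsplit() drops the empty string after the trailing "_".
parse.field <- function(x) strsplit(x, "_", fixed = TRUE)[[1]]

# One row per trend segment; the indices point into t.arr.
segments <- data.frame(
  trend   = parse.field(first.row$pattern),
  start.t = t.arr[as.integer(parse.field(first.row$start.idx))],
  end.t   = t.arr[as.integer(parse.field(first.row$end.idx))],
  stringsAsFactors = FALSE
)

# One row per dynamic time point and its direction.
dyn <- data.frame(
  time = as.numeric(parse.field(first.row$dynTime)),
  sign = parse.field(first.row$dynSign),
  stringsAsFactors = FALSE
)

# Rebuild the printable sub-pattern string from the segments.
pattern.str <- paste0(
  segments$start.t[1], "h_",
  paste0(segments$trend, "_", segments$end.t, "h", collapse = "_")
)
print(segments)
print(dyn)
print(pattern.str)  # "0h_up_48h_down_168h"
```

The rebuilt string matches the pattern_str column, and the times looked up through start.idx and end.idx match start.t (0_48_) and end.t (48_168_) for this gene.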