3 flag class

We view the filtering of microarray data, and especially array-CGH data, as a succession of steps of two kinds: excluding unreliable spots or clones from the data (according to criteria such as signal to noise ratio or replicate consistency), and correcting signal values for sources of variation that are not biologically relevant (such as spotting effects, spatial effects, or intensity effects).

We introduce the formalism of flags to deal with this filtering issue: in the following two subsections, we describe the attributes and methods of flag objects.

3.1 Attributes

A flag object f is a list whose most important items are a function (f$FUN) that has to be applied to an object of class arrayCGH, and a character value (f$char) that will allow us to identify the flagged spots. Optionally further arguments can be passed to f$FUN via f$args, and a label can be added via f$label. The examples of this subsection use the function to.flag, which is explained in subsection 3.2.
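To make this structure concrete, the following sketch builds by hand a plain R list with the four items named above and applies its function to a toy object. It does not call to.flag: the function toy.FUN, the item values and the toy array are hypothetical, and a real flag object may carry further items.

```r
# Hypothetical flag function: takes an arrayCGH-like list and returns
# the indices of the spots whose value for 'var' is missing.
toy.FUN <- function(arrayCGH, var) {
    which(is.na(arrayCGH$arrayValues[[var]]))
}

# Hand-built list mimicking the items described in the text
# (an assumption, not the output of to.flag).
f <- list(FUN = toy.FUN, char = "N", args = alist(var = "LogRatio"),
    label = "Missing log-ratio")

# Toy arrayCGH-like object: only the arrayValues item is filled in.
toy.array <- list(arrayValues = data.frame(LogRatio = c(0.1, NA, -0.3, NA)))

# Apply f$FUN to the array, passing the extra arguments stored in f$args.
flagged <- do.call(f$FUN, c(list(toy.array), f$args))
print(flagged)
```

Storing the extra arguments in the flag itself means that the same function can be reused with different settings without being rewritten.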

3.1.1 Exclusion and correction flags

As stated above, we make the distinction between flags that exclude spots from further analysis and flags that correct signal values:

exclusion flags

If f is an exclusion flag, f$FUN returns the indices of the spots to exclude, and f$char is a non-NULL character value that identifies the flag. In the following example, we define SNR.flag, a flag object that excludes spots whose signal to noise ratio is lower than the threshold snr.thr.

> SNR.FUN <- function(arrayCGH, var.FG, var.BG, snr.thr) {
+     which(arrayCGH$arrayValues[[var.FG]] < arrayCGH$arrayValues[[var.BG]] * 
+         snr.thr)
+ }
> SNR.char <- "B"
> SNR.label <- "Low signal to noise ratio"
> SNR.flag <- to.flag(SNR.FUN, SNR.char, args = alist(var.FG = "REF_F_MEAN", 
+     var.BG = "REF_B_MEAN", snr.thr = 3), label = SNR.label)
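As an illustration of what the exclusion function returns, the sketch below runs SNR.FUN directly on a toy object. The toy array is hypothetical: it is a plain list holding only an arrayValues data frame with made-up intensities, not an object read from real data.

```r
# Same function as in the example above, repeated so that the sketch
# can be run on its own.
SNR.FUN <- function(arrayCGH, var.FG, var.BG, snr.thr) {
    which(arrayCGH$arrayValues[[var.FG]] < arrayCGH$arrayValues[[var.BG]] *
        snr.thr)
}

# Hypothetical foreground and background intensities for five spots.
toy.array <- list(arrayValues = data.frame(
    REF_F_MEAN = c(200, 900, 1200, 100, 400),
    REF_B_MEAN = c(100, 100, 150, 50, 100)))

# Spots whose foreground is less than three times the background.
low.snr <- SNR.FUN(toy.array, "REF_F_MEAN", "REF_B_MEAN", snr.thr = 3)
print(low.snr)
```

Here the first and fourth spots are returned, since 200 < 3 * 100 and 100 < 3 * 50.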

correction flags

If f is a correction flag, f$FUN returns an object of type arrayCGH and f$char is NULL. In the following example, global.spatial.flag computes a spatial trend on the array, and subtracts this trend from the signal log-ratios:

> global.spatial.FUN <- function(arrayCGH, var) {
+     if (!is.null(arrayCGH$arrayValues$Flag)) 
+         arrayCGH$arrayValues$LogRatio[which(arrayCGH$arrayValues$Flag != 
+             "")] <- NA
+     Trend <- arrayTrend(arrayCGH, var, span = 0.03, degree = 1, 
+         iterations = 3)
+     arrayCGH$arrayValues[[var]] <- Trend$arrayValues[[var]] - 
+         Trend$arrayValues$Trend
+     arrayCGH
+ }
> global.spatial.flag <- to.flag(global.spatial.FUN, args = alist(var = "LogRatio"))
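The contract of a correction function (take an array, return the array with modified values) can be tried without the package on a simplified case. In the sketch below, the trend is a per-row median, a crude stand-in for the trend computed by arrayTrend; the function row.median.FUN, the column name Row and the toy values are all hypothetical.

```r
# Hypothetical correction function: subtracts the median of each array
# row from the variable 'var'. This is NOT the trend used by arrayTrend.
row.median.FUN <- function(arrayCGH, var) {
    values <- arrayCGH$arrayValues[[var]]
    rows <- arrayCGH$arrayValues$Row
    # ave() returns, for each spot, the median of the row it belongs to.
    trend <- ave(values, rows, FUN = function(x) median(x, na.rm = TRUE))
    arrayCGH$arrayValues[[var]] <- values - trend
    # Like any correction function, return the whole (modified) array.
    arrayCGH
}

# Toy array with two rows of three spots each.
toy.array <- list(arrayValues = data.frame(
    Row = c(1, 1, 1, 2, 2, 2),
    LogRatio = c(0.5, 0.7, 0.6, -0.2, 0.0, -0.1)))

corrected <- row.median.FUN(toy.array, "LogRatio")
print(corrected$arrayValues)
```

After correction, the log-ratios of each row are centered on zero, while the rest of the object is left unchanged.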

3.1.2 Permanent and temporary flags

We introduce an additional distinction between permanent and temporary flags in order to deal with spots or clones that are known to be biologically relevant, but that must not be taken into account when computing a scaling normalization coefficient. For example in breast cancer, when the reference DNA comes from a male, we expect a gain of the X chromosome and a loss of the Y chromosome in the tumor sample, and we do not want the log-ratio values of the X and Y chromosomes to bias the estimation of the scaling normalization coefficient.

Any flag object therefore contains an argument called type, which defaults to "perm.flag" (permanent) but can be set to "temp.flag" in the case of a temporary flag. In the following example, chromosome.flag is a temporary flag that identifies clones corresponding to the X and Y chromosomes:

> chromosome.FUN <- function(arrayCGH, var) {
+     var.rep <- arrayCGH$id.rep
+     w <- which(!is.na(match(as.character(arrayCGH$cloneValues[[var]]), 
+         c("X", "Y"))))
+     l <- arrayCGH$cloneValues[w, var.rep]
+     which(!is.na(match(arrayCGH$arrayValues[[var.rep]], as.character(l))))
+ }
> chromosome.char <- "X"
> chromosome.label <- "Sexual chromosome"
> chromosome.flag <- to.flag(chromosome.FUN, chromosome.char, type = "temp.flag", 
+     args = alist(var = "Chromosome"), label = chromosome.label)
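The two-step lookup performed by chromosome.FUN (from chromosome to clone identifier, then from clone identifier to spot) can be followed on a toy object. The sketch below is only an illustration: the identifier column CloneName and the clone and spot values are hypothetical.

```r
# Same function as in the example above, repeated so that the sketch
# can be run on its own.
chromosome.FUN <- function(arrayCGH, var) {
    var.rep <- arrayCGH$id.rep
    w <- which(!is.na(match(as.character(arrayCGH$cloneValues[[var]]),
        c("X", "Y"))))
    l <- arrayCGH$cloneValues[w, var.rep]
    which(!is.na(match(arrayCGH$arrayValues[[var.rep]], as.character(l))))
}

# Toy object: three clones, each spotted twice on the array.
toy.array <- list(
    id.rep = "CloneName",
    cloneValues = data.frame(CloneName = c("c1", "c2", "c3"),
        Chromosome = c("1", "X", "Y"), stringsAsFactors = FALSE),
    arrayValues = data.frame(
        CloneName = c("c1", "c1", "c2", "c2", "c3", "c3"),
        stringsAsFactors = FALSE))

# Indices of the spots whose clone lies on the X or Y chromosome.
sex.spots <- chromosome.FUN(toy.array, "Chromosome")
print(sex.spots)
```

Clones c2 and c3 lie on the X and Y chromosomes, so the four spots that replicate them (spots 3 to 6) are returned.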

3.2 Methods

3.2.1 to.flag

The function to.flag is used to create flag objects, with the specificities described in subsection 3.1.

> args(to.flag)
function (FUN, char = NULL, args = NULL, type = "perm.flag", 
    label = NULL) 
NULL

3.2.2 flag.arrayCGH

Function flag.arrayCGH simply applies the function flag$FUN of a flag object to an object of type arrayCGH, and returns:

a filtered array with field arrayCGH$arrayValues$Flag filled with the value of flag$char for each spot to be excluded from further analysis, in the case of an exclusion flag;
an array with corrected signal values, in the case of a correction flag.

> args(flag.arrayCGH)
function (flag, arrayCGH) 
NULL

3.2.3 flag.summary

Function flag.summary computes spot-level information about normalization (including the number of flagged spots and numeric normalization parameters), and displays it in a convenient way. This function can either be applied to an object of type arrayCGH:

> args(flag.summary.arrayCGH)
function (arrayCGH, flag.list, flag.var = "Flag", nflab = "not flagged", 
    ...) 
NULL

or to plain spot-level information, by using the default method:

> args(flag.summary.default)
function (spot.flags, flag.list, nflab = "not flagged", ...) 
NULL
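The counting part of such a summary can be approximated with base R alone. The sketch below is not the code of flag.summary: it only tabulates a hypothetical vector of spot-level flags, and it assumes that unflagged spots carry an empty string and that nflab is the label displayed for them.

```r
# Hypothetical spot-level flags: "" for unflagged spots, otherwise the
# character of the flag that excluded the spot.
spot.flags <- c("", "B", "", "X", "B", "")

# Replace the empty string by a readable label (mirrors the default
# value of nflab shown above).
nflab <- "not flagged"
labelled <- ifelse(spot.flags == "", nflab, spot.flags)

# Number of spots per flag.
counts <- table(labelled)
print(counts)
```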

Pierre Neuvial 2007-03-16