6 Graphical representations

As in any type of data analysis, appropriate graphical representations are essential for understanding the data. Array-CGH data are typically ratios or log-ratios that correspond both to locations on the array (spots) and to locations on the genome (clones). Therefore, for array-CGH data normalization, two complementary types of representation are necessary:

- a dotplot of the array that takes the array design into account. This is a crucial tool for array-CGH data normalization for two reasons: first, it provides an easy way to identify spatial artifacts such as row, column, and print-tip group effects, as well as spatial bias and spatial gradients on the array; second, it allows a post-normalization check that the normalization procedure reached its goal, i.e. significantly reduced the observed effects.

- a plot of the signal values along the genome, which gives a visual impression of the array quality from the standpoint of biological relevance; comparing the signal shape before and after normalization gives a qualitative idea of the improvement in data quality provided by the normalization method.

The arrayPlot method provided by the GLAD package and based on maImage (2) addresses the first point; we add two methods to this toolbox:

- the genome.plot method displays a plot of any signal value (e.g. log-ratios) along the genome;

- the report.plot method successively calls arrayPlot and genome.plot in order to provide a simultaneous view of the data using the two relevant metrics (array and genome), with appropriate color scales.

6.1 genome.plot

- if the variable is numeric (e.g. signal to noise ratio), the function assumes that it is a quantitative variable and adapts a color palette to its values (figure 3).

**Figure 3:** Pan-genomic profile of the array. Colors are proportional to log-ratio values.
> data(spatial)
> genome.plot(edge.norm, chrLim = "LimitChr")

$\includegraphics{MANOR-eval-017}$

- if the variable is not numeric (e.g. the copy number variation as estimated by GLAD, or a character variable making the distinction between flagged and un-flagged clones), the function counts the number of distinct values (levels) of the variable and defines an appropriate color scale using the rainbow function (figure 4).

**Figure 4:** Pan-genomic profile of the array. Colors correspond to the values of the variable `ZoneGNL`.
> data(spatial)
> edge.norm$cloneValues$ZoneGNL <- as.factor(edge.norm$cloneValues$ZoneGNL)
> genome.plot(edge.norm, col.var = "ZoneGNL", chrLim = "LimitChr")

$\includegraphics{MANOR-eval-018}$
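The two cases described in this section (a continuous palette for a numeric variable, one rainbow color per level otherwise) can be illustrated with a minimal base R sketch. This is not the code of genome.plot: the helper name `pick.colors`, its `n.col` argument, and the green/black/red palette are assumptions made for the illustration only, and the data are made up.

```r
# Hypothetical helper (not part of MANOR or GLAD): return one colour per
# value, chosen according to the type of the variable.
pick.colors <- function(x, n.col = 64) {
  if (is.numeric(x)) {
    # Quantitative variable: map values linearly onto a continuous palette.
    # The green/black/red palette is an assumption for this sketch.
    pal <- colorRampPalette(c("green", "black", "red"))(n.col)
    rng <- range(x, na.rm = TRUE)
    if (rng[1] == rng[2]) {
      # Constant variable: use the middle colour, keep missing values as NA
      idx <- ifelse(is.na(x), NA, ceiling(n.col / 2))
    } else {
      idx <- 1 + round((x - rng[1]) / (rng[2] - rng[1]) * (n.col - 1))
    }
    pal[idx]
  } else {
    # Qualitative variable: count the levels, one rainbow colour per level
    f <- as.factor(x)
    pal <- rainbow(nlevels(f))
    pal[as.integer(f)]
  }
}

# Made-up values, for illustration only
log.ratio <- c(-0.8, -0.1, 0, 0.3, 1.2)
zone <- c("loss", "normal", "normal", "normal", "gain")

pick.colors(log.ratio)  # five colours taken along the continuous palette
pick.colors(zone)       # three distinct colours, one per level
```

Converting a character variable to a factor first, as done above for `ZoneGNL`, makes the set of levels explicit and therefore the number of colors predictable.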

6.2 report.plot

This function successively calls arrayPlot and genome.plot (figure 5).

**Figure 5:** `report.plot`: array image and pan-genomic profile after normalization.
> data(spatial)
> report.plot(edge.norm, chrLim = "LimitChr", zlim = c(-1, 1))

$\includegraphics{MANOR-eval-019}$
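To make the idea of a combined view concrete, the following base R sketch draws an array view and a genome view of the same simulated spots with a single shared color scale. It is only an illustration under stated assumptions: `sketch.report` is a hypothetical stand-in and not the implementation of report.plot, the data are simulated rather than taken from `spatial`, and clipping values to `zlim` before coloring is an assumption about how such a limit might be applied.

```r
# Simulated spots (hypothetical data, not the 'spatial' data set)
set.seed(1)
n.row <- 10
n.col <- 12
spots <- expand.grid(Row = seq_len(n.row), Col = seq_len(n.col))
spots$Position <- sample(nrow(spots))   # genome order differs from array order
spots$LogRatio <- rnorm(nrow(spots), sd = 0.3)

# One colour scale shared by both views; values are clipped to zlim
# (assumption made for this sketch)
zlim <- c(-1, 1)
pal <- colorRampPalette(c("green", "black", "red"))(64)
z <- pmin(pmax(spots$LogRatio, zlim[1]), zlim[2])
idx <- 1 + round((z - zlim[1]) / diff(zlim) * (length(pal) - 1))

# Hypothetical stand-in for a combined report: array on top, genome below
sketch.report <- function(spots, cols) {
  op <- par(mfrow = c(2, 1), mar = c(4, 4, 2, 1))
  on.exit(par(op))
  # Array view: one square per spot, row 1 drawn at the top
  plot(spots$Col, spots$Row, col = cols, pch = 15, cex = 2,
       ylim = rev(range(spots$Row)),
       xlab = "Column", ylab = "Row", main = "Array view")
  # Genome view: same spots and same colours, ordered along the genome
  plot(spots$Position, spots$LogRatio, col = cols, pch = 20,
       xlab = "Position along the genome", ylab = "Log-ratio",
       main = "Genome view")
  invisible(cols)
}

pdf(NULL)  # off-screen device, so the sketch also runs non-interactively
sketch.report(spots, pal[idx])
invisible(dev.off())
```

Because both panels use the same colors, a cluster of same-colored spots on the array that is scattered along the genome points to a spatial artifact, not a genomic alteration.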

Pierre Neuvial 2007-03-16