DeDaL: Data-Driven Network Layout
Cytoscape 3.0 app for producing and morphing data-driven and structure-driven network layouts


Introduction

DeDaL (Data-Driven network Layout) is a Cytoscape 3.0 [1] (http://cytoscape.org) app developed by the Computational Systems Biology of Cancer group in the Bioinformatics Laboratory of Institut Curie (Paris).

A scientific article about DeDaL has been published in BMC Systems Biology [10].

The source code of DeDaL is available in a GitHub repository.

DeDaL is a universal tool; an example of a non-biological application of DeDaL can be found here.

Knowledge of molecular interactions in living cells is usually represented in the form of network diagrams depicting, for instance, protein-protein interactions, biochemical reactions, or more abstract influences of one molecule on other molecules. Providing an insightful layout for such diagrams is not a trivial problem. At the same time, large amounts of data are produced by high-throughput biotechnologies. There is an urgent need for new methods that integrate the information provided in biological diagrams with multidimensional -omics datasets. Classically, high-throughput data are mapped on top of network layouts that are computed from the network structure alone.

DeDaL is a Cytoscape 3.0 app which uses several dimension reduction algorithms to produce data-driven network layouts based on multidimensional data (typically gene expression). DeDaL implements several data pre-processing and layout post-processing steps, such as continuous morphing between two arbitrary network layouts and aligning one network layout with respect to another by rotating and mirroring. Combining these possibilities facilitates creating insightful network layouts that represent both structural network features and the correlation patterns in multivariate data. The app is implemented in Java. DeDaL has the following functions:

  1. Data-Driven Layout:
    • pre-processing options:
      • double center data
      • network-smooth data
    • Principal Component Analysis (PCA)
    • Elastic map (nonlinear PCA)
    • save processed data
  2. Layout aligning
  3. Layout morphing
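
To make the data-driven layout option more concrete, here is a minimal Python sketch of double centering followed by a PCA-based layout. It is not DeDaL's Java implementation. It assumes a data matrix with one row per network node and one column per sample, and it uses power iteration to find the first two principal axes, which is a choice made for this illustration only. The function names (double_center, center_columns, leading_axis, pca_layout) are hypothetical.

```python
import math
import random


def double_center(rows):
    """Subtract row means and column means, then add back the grand mean."""
    n, m = len(rows), len(rows[0])
    row_means = [sum(r) / m for r in rows]
    col_means = [sum(r[j] for r in rows) / n for j in range(m)]
    grand_mean = sum(row_means) / n
    return [[rows[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(m)] for i in range(n)]


def center_columns(rows):
    """Subtract each column mean so that PCA works on centered data."""
    n, m = len(rows), len(rows[0])
    col_means = [sum(r[j] for r in rows) / n for j in range(m)]
    return [[r[j] - col_means[j] for j in range(m)] for r in rows]


def leading_axis(rows, iterations=500, seed=0):
    """Approximate the first principal axis by power iteration (assumed method)."""
    m = len(rows[0])
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(m)]
    for _ in range(iterations):
        scores = [sum(r[j] * v[j] for j in range(m)) for r in rows]
        w = [sum(s * r[j] for s, r in zip(scores, rows)) for j in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # numerically no variance left in the data
            break
        v = [x / norm for x in w]
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def pca_layout(rows, use_double_centering=True):
    """Return one (x, y) pair per row: its scores on PC1 and PC2."""
    data = double_center(rows) if use_double_centering else rows
    data = center_columns(data)
    pc1 = leading_axis(data)
    x = [sum(a * b for a, b in zip(r, pc1)) for r in data]
    # Remove the PC1 part so the next power iteration finds PC2.
    residual = [[r[j] - s * pc1[j] for j in range(len(pc1))]
                for r, s in zip(data, x)]
    pc2 = leading_axis(residual)
    y = [sum(a * b for a, b in zip(r, pc2)) for r in residual]
    return list(zip(x, y))
```

Each returned pair can be read as the position of one node in the layout. The sign of each axis is arbitrary, so two runs on related data may produce mirrored pictures.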

We suggest following our Tutorial to see an example of using DeDaL with expression data for two cancer subtypes on the Fanconi DNA repair pathway.

We also applied DeDaL to the network of proteins interacting with the ESR1 protein [Fig 3]. In this case, the second principal component shows, for example, that the expression levels of EGFR and CCNE1 are modulated differently although both are upregulated in the basal-like subtype. The PCA layout also highlights a particular expression pattern of some hub genes such as AR or EGFR, and shows that the genes underexpressed in the basal-like subtype form a more tightly connected subnetwork. Morphing the original organic network layout with the PCA-based layout moves some of the proteins while preserving the general PCA pattern. For example, the underexpressed PIK3R1, IGFR1 and ERBB2 genes are moved to the left because each of them is connected to several overexpressed genes. Applying network smoothing drives the hub genes to the center of the layout, because their values are averaged over the hub’s neighbors. This produces a more regular pattern of network connections while approximately conserving the neighborhood relations of the PCA layout. DeDaL makes it easy to identify groups of genes with similar expression patterns, even within a group with a similar level of expression. In addition, it is easy to morph between the structure-based layout and the data-driven layout, which lets the user access and visualize both kinds of information without additional tools.
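
The effect of network smoothing on hub genes can be illustrated with a small Python sketch. It assumes one plausible form of smoothing, in which each node's profile is blended with the mean profile of its direct neighbors. The blending parameter alpha and the function name network_smooth are hypothetical, and the exact formula used by DeDaL may differ.

```python
def network_smooth(values, edges, alpha=0.5):
    """Blend each node's profile with the mean profile of its neighbors.

    values: dict mapping node name -> list of numbers (one per sample)
    edges:  iterable of (node_a, node_b) pairs, treated as undirected
    alpha:  hypothetical weight of the neighbor mean (0 = no smoothing)
    """
    neighbors = {node: set() for node in values}
    for a, b in edges:
        # Ignore self-loops and edges to nodes without data.
        if a in values and b in values and a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)

    smoothed = {}
    for node, profile in values.items():
        nbrs = neighbors[node]
        if not nbrs:
            # Isolated nodes keep their original profile.
            smoothed[node] = list(profile)
            continue
        mean = [sum(values[n][k] for n in nbrs) / len(nbrs)
                for k in range(len(profile))]
        smoothed[node] = [(1.0 - alpha) * x + alpha * m
                          for x, m in zip(profile, mean)]
    return smoothed
```

A hub with many neighbors is pulled toward the average of a large part of the network, which is consistent with hubs moving toward the center of the layout after smoothing.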

It is also possible to use DeDaL to visualize genetic information. We applied DeDaL to create a DDL layout for a group of yeast genes involved in DNA repair and replication. The genetic interactions between these genes and the epistatic profiles (computed only with respect to this group of genes) were taken from [7]. The definitions of DNA repair pathways were taken from the KEGG database [8]. Figure 4 shows the difference between applying the standard organic layout to this small network of genetic interactions and applying a PCA-based DDL (computed here without double-centering the data matrix, in order to take into account the tendency of genes to interact with a smaller or larger number of other genes). The PCA-based DDL in this case groups the genes with respect to their epistatic profiles. Firstly, the local hub genes RAD27 and POL32 have distinct positions in this layout. Secondly, the PCA-based DDL roughly groups the genes according to the DNA repair pathway in which they are involved. For example, it shows that the Non-homologous end joining DNA repair pathway is closer to the Homologous recombination (HR) pathway than to the Mismatch repair pathway. It also underlines that some homologous recombination genes (such as RDH54) are characterized by a different pattern of genetic interactions than the “core” HR genes RAD51, RAD52, RAD54, RAD55, RAD57.

In the next example we apply DeDaL to the Boolean model of cell fate decisions between survival, apoptosis and non-apoptotic cell death (such as necrosis) published in [9], to group the nodes of the influence diagram according to their co-activation patterns in the logical steady states. The table of steady states was taken from [9] (Figure 5, top right) and used to compute the PCA-based DDL (Figure 5, bottom left). In this DDL, nodes in close positions have similar patterns of activation in the steady states (such as RIP1 and RIP1K). We morphed between the PCA-based DDL and the initial layout of the model (as it was designed in [9]) to visualize several stable states corresponding to different cell fates (Figure 6). In this layout, co-activated nodes tend to form compact groups. Therefore, DeDaL can be used to design layouts of mathematical models of biological networks, using the solutions of the model.
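
Morphing between two layouts can be pictured as moving every node along a straight line from its position in one layout to its position in the other. The Python sketch below shows this idea under the assumption of simple linear interpolation; the function name morph_layout and the parameter t are hypothetical, and DeDaL's own morphing may be implemented differently.

```python
def morph_layout(start, end, t):
    """Interpolate node positions between two layouts.

    start, end: dicts mapping node name -> (x, y)
    t:          0.0 returns the start layout, 1.0 returns the end layout
    Only nodes present in both layouts are returned.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be between 0 and 1")
    shared = start.keys() & end.keys()
    return {
        node: ((1.0 - t) * start[node][0] + t * end[node][0],
               (1.0 - t) * start[node][1] + t * end[node][1])
        for node in shared
    }


# Made-up coordinates, for illustration only.
structure_layout = {"RIP1": (0.0, 0.0), "RIP1K": (10.0, 0.0)}
data_layout = {"RIP1": (4.0, 2.0), "RIP1K": (6.0, 2.0)}
halfway = morph_layout(structure_layout, data_layout, 0.5)
```

An intermediate value of t gives a compromise in which the structure-driven picture is still recognizable while nodes with similar data have already moved closer together.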



Downloads

Install DeDaL test version

Download the latest release of DeDaL (7 MB).

Using the Cytoscape app manager

  1. Launch the Cytoscape app manager (menu "Apps -> App Manager -> Install from File...").
  2. Select the DeDaL.jar file.
  3. Click "Install".

Download files for a tutorial:

  • network (network.sif)
  • data (data.txt)
  • style (fanconi_style_file.xml)

    Tutorial

    In this tutorial you will follow a set of instructions in order to get acquainted with the standard functions of DeDaL. We used The Cancer Genome Atlas (TCGA) breast cancer transcriptomic dataset (548 patients) and the Human Protein Reference Database (HPRD [6]) as a source of the protein-protein interaction network. As an example of a small subnetwork, we selected proteins involved in the Fanconi anemia DNA repair pathway as it is defined in the Atlas of Cancer Signaling Network (ACSN). The data set used in this tutorial is public and accessible from the TCGA database. The dataset also contains a column with the values of the t-test computed for the gene expression difference between basal-like tumours (one of the molecular subtypes of breast cancer, significantly contributing to the intertumoral variability) and non-basal-like breast tumours; this column will be used for node coloring.

    In this exercise we will identify patterns in the data using PCA with network smoothing and double centering, perform morphing between a purely structure-based layout and the PCA layout, and align networks to make them easier to compare.
    1. Download the files from the Downloads section
    2. Open Cytoscape 3.0:
      • a new Session is opened automatically; if not, click File->New->Session
    3. Import network file:
      • File->Import->Network->File . . .
      • Select the downloaded file: network.sif from your files
      • Click OK.
    4. Install the app using the Cytoscape app manager
      • Launch the Cytoscape app manager (menu "Apps -> App Manager-> Install from File . . . ").
      • Select the DeDaL.jar file
      • Click "Install".
      • If you can see the App Manager window as shown below, it means the app is correctly installed
      • Close the dialog window of the App Manager
    5. Load data:
      • File->Import->Table->File... (select data.txt)
      • Leave parameters by default
      • Click OK.
    6. Color nodes according to the t-test values
      • File->Import->Network->Style... (select fanconi_style_file.xml)
      • Go to Style in the Control Panel on the left
      • Change the style to default_0
      • You should see that the nodes have changed shape and are filled according to the t-test values (_TTEST column)
    7. Apply Data-Driven network Layout (PCA):
      • Layout->DeDaL->Data-Driven Layout (note: in the most recent version, DeDaL appears under Layout, not Tools)
      • You will see a dialogue window (compare with the screenshot below):
        • Check "double center"
        • Keep the PCA button selected, as by default, as well as PC1 and PC2 (principal components one and two will be projected)
        • Select all columns starting with BAS (for basal data) or NBAS (for non-basal). For multiple selection, select the first entry, hold the SHIFT key and click on the last entry you want to select.
        • Click "OK and save data"
      • You should now see the PCA layout of the network and the Report window with the percentage of variance explained by each PC (up to 10)
      • You can repeat exactly the same steps with "network-smooth data" checked to observe the result of network smoothing.
      • You can also use the saved file to visualize or process the data in any other software after smoothing/double centering.
              

        One can see that in the PCA layout the first principal component sorts the nodes according to the t-test, because in this case the first principal component is associated with the basal-like breast cancer subtype. The second principal component gives additional information, for example that the expression levels of BRCA2 and FANCE are modulated differently although both are upregulated in the basal-like subtype.

           

        The layout preserves the general pattern of the PCA-based DDL, while better visualizing the network structure and moving some proteins to a different position. For example, the BRCA1 gene is moved to the right because it is connected to several genes overexpressed in the basal-like breast cancer subtype.

    8. Layout morphing: gradual transformation from one layout into another
      • File->New->Network->Clone current network
      • Layout->yFiles Layouts->Organic: this layout is based purely on the network structure (you can choose any other layout if you wish)
      • Make sure the active network is network.sif_2 (should be highlighted in the left panel)
      • Go to Tools->DeDaL-> Layout morphing
      • In the dialogue window select network.sif and "align"
      • Click OK.
      • A new network will be opened and you will see the Slider dialogue
      • Move the slider to the right and follow the gradual transformation from the organic layout into the PCA layout

        Morphing the organic network layout with the PCA-based layout moves the positions of some of the genes, preserving the general PCA pattern while better reflecting the network structure.

      • Set the slider to the desired position and close the Slider window
    9. Layout aligning

      Morphing between two network layouts might be meaningless if all nodes in one layout are systematically rotated or flipped with respect to the node positions in the other layout. This is often the case when a purely data-driven layout is compared with the initial structure-driven layout. For this case, DeDaL can minimize the distance between two layouts, defined as the sum of squared Euclidean distances between all matched nodes, over all possible rotations and mirrorings of one of the layouts.

      • Go to Tools->DeDaL->Layout aligning
      • Select the reference network network.sif and the Network to Align: network.sif_2 (in the screenshot, 3 extreme nodes are encircled with different colors, and the corresponding nodes in the second network are encircled with the same colors)
      • Click OK
      • network.sif_1 (organic layout) is now rotated and/or mirrored to minimize the distance to the reference network network.sif (PCA layout), which makes visual comparison easier (the color legend is the same as above: 3 extreme nodes are encircled with different colors, and the corresponding nodes in the second network with the same colors)
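
The minimization described in step 9 can be sketched in a few lines of Python. This is an illustration, not DeDaL's code. It assumes that both layouts are first centered on their centroids (the text above mentions only rotations and mirroring, so the centering is an added assumption), tries the layout as is and mirrored along the x axis, and for each variant uses the closed-form optimal rotation angle for 2D points. All function names are hypothetical.

```python
import math


def centroid(layout, nodes):
    """Mean x and mean y over the given nodes."""
    count = len(nodes)
    return (sum(layout[n][0] for n in nodes) / count,
            sum(layout[n][1] for n in nodes) / count)


def best_rotation(moving, reference):
    """Rotate `moving` about the origin to minimize the summed squared distance."""
    dot = sum(mx * reference[n][0] + my * reference[n][1]
              for n, (mx, my) in moving.items())
    cross = sum(mx * reference[n][1] - my * reference[n][0]
                for n, (mx, my) in moving.items())
    theta = math.atan2(cross, dot)  # closed-form optimum for 2D points
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = {n: (x * cos_t - y * sin_t, x * sin_t + y * cos_t)
               for n, (x, y) in moving.items()}
    cost = sum((rotated[n][0] - reference[n][0]) ** 2 +
               (rotated[n][1] - reference[n][1]) ** 2 for n in rotated)
    return cost, rotated


def align_layout(moving, reference):
    """Return (aligned layout, remaining squared distance) for matched nodes."""
    nodes = sorted(moving.keys() & reference.keys())
    mcx, mcy = centroid(moving, nodes)
    rcx, rcy = centroid(reference, nodes)
    mov = {n: (moving[n][0] - mcx, moving[n][1] - mcy) for n in nodes}
    ref = {n: (reference[n][0] - rcx, reference[n][1] - rcy) for n in nodes}
    mirrored = {n: (-x, y) for n, (x, y) in mov.items()}
    # Keep whichever variant (original or mirrored) ends up closer.
    cost, best = min((best_rotation(c, ref) for c in (mov, mirrored)),
                     key=lambda pair: pair[0])
    # Place the aligned layout on the centroid of the reference layout.
    aligned = {n: (x + rcx, y + rcy) for n, (x, y) in best.items()}
    return aligned, cost
```

Calling align_layout(organic_positions, pca_positions) with two dictionaries of node coordinates returns the rotated and possibly mirrored organic layout, together with the squared distance that remains after alignment.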

    10. Save your working session
      • If you want to work on this session later: File->Save
    THE END
    Don't hesitate to contact us if you have questions

    Contacts


    References


    Acknowledgements

    Urszula Czerwinska is thankful to the INSERM U900 unit for providing a 5-month internship, to Eric Bonnet and Eric Viara for helping with the code, and to Loredana Martignetti for generously answering all questions.