How to use KEGG Mapper to visualize your data on KEGG Pathways?
The KEGG database allows users to “paint” their own data on the pathways referenced in the database. This tool lets you visualize gene expression data, including RNA-seq data. For a first-time user, formatting data for analysis can be a daunting task, and this tutorial is here to help you through it. You can access the KEGG Search Pathway tool here, and the KEGG Search & Color Pathway tool here.
1. Identify your organism
One of the most important parts of this process is identifying the “prefix” of the organism whose pathways you want to paint your data on. A full list of these prefixes can be found here. Here is a list of the prefixes currently available for Bordetella:
- Bordetella pertussis Tohama I: [bpe]
- Bordetella pertussis CS: [bpc]
- Bordetella pertussis 18323: [bper]
- Bordetella pertussis B1917: [bpet]
- Bordetella pertussis 137: [bpeu]
- Bordetella parapertussis 12822: [bpa]
- Bordetella parapertussis Bpp5: [bpar]
- Bordetella bronchiseptica RB50: [bbr]
- Bordetella bronchiseptica MO149: [bbm]
- Bordetella bronchiseptica 253: [bbh]
- Bordetella bronchiseptica S798: [bbx]
- Bordetella petrii: [bpt]
- Bordetella avium: [bav]
- Bordetella holmesii ATCC51541: [bho]
- Bordetella holmesii 44057: [bhm]
Enter the prefix shown in brackets for your organism of interest in the field “Search against:”.
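For scripted workflows, the prefix list above can be kept as a small lookup table. The sketch below is illustrative only: the dictionary simply transcribes the list in this tutorial, and `find_prefixes` is a hypothetical helper name, not part of KEGG.

```python
# Prefixes transcribed from the Bordetella list in this tutorial
# (strain name -> KEGG organism prefix).
BORDETELLA_PREFIXES = {
    "Bordetella pertussis Tohama I": "bpe",
    "Bordetella pertussis CS": "bpc",
    "Bordetella pertussis 18323": "bper",
    "Bordetella pertussis B1917": "bpet",
    "Bordetella pertussis 137": "bpeu",
    "Bordetella parapertussis 12822": "bpa",
    "Bordetella parapertussis Bpp5": "bpar",
    "Bordetella bronchiseptica RB50": "bbr",
    "Bordetella bronchiseptica MO149": "bbm",
    "Bordetella bronchiseptica 253": "bbh",
    "Bordetella bronchiseptica S798": "bbx",
    "Bordetella petrii": "bpt",
    "Bordetella avium": "bav",
    "Bordetella holmesii ATCC51541": "bho",
    "Bordetella holmesii 44057": "bhm",
}


def find_prefixes(query):
    """Return (strain, prefix) pairs whose strain name contains the query.

    Hypothetical helper; matching is a case-insensitive substring search.
    """
    q = query.lower()
    return [
        (name, prefix)
        for name, prefix in BORDETELLA_PREFIXES.items()
        if q in name.lower()
    ]


# Look up the prefix to type into the "Search against:" field.
print(find_prefixes("RB50"))
```

Because the search is a plain substring match, a query such as "pertussis" also returns the parapertussis strains, so it is worth checking the full strain name before copying a prefix.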
2. Format your data for analysis
The database has recently simplified the data input format. You can now simply paste a list of genes, or format the list to highlight different genes in different colors. Genes can be given as locus tags or as gene names.
a. Simple data input (Search Pathway)
To visualize your data on the KEGG pathways, paste your list of genes in the field “Enter objects” here. These genes will be marked on the pathway in red.
If you want to test it, you can use the following set of genes:
BPt01 BPt02 BPt03 BPt04 secE nusG rplK rplA rplJ glyQ
b. Advanced data input (Search & Color Pathway)
You can choose the color in which genes are displayed on the pathway. You can create different groups of genes and display various levels of gene expression in different colors. In this example, genes that are over-expressed in a dataset are displayed in blue and genes that are down-regulated are displayed in red. Gene names or locus tags are formatted as:
locus color_background,color_text
Here is an example:
BPt01 blue,black
BPt02 blue,black
BPt03 blue,black
BPt04 blue,black
secE blue,black
nusG blue,black
rplK red,black
rplA red,black
rplJ red,black
glyQ red,black
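If your expression data live in a table, the colored input can be generated rather than typed. The following is a minimal sketch, assuming each gene comes with a log2 fold change and following this tutorial's convention (blue for over-expressed, red for down-regulated). The function name `to_kegg_color_lines`, the threshold of 1.0, and the fold-change values are all hypothetical and made up for illustration.

```python
def to_kegg_color_lines(fold_changes, threshold=1.0, text_color="black"):
    """Build 'gene background,text' lines from a {gene: log2 fold change} dict.

    Genes at or above +threshold get a blue background (over-expressed),
    genes at or below -threshold get a red background (down-regulated),
    and genes in between are left out of the output.
    """
    lines = []
    for gene, log2fc in fold_changes.items():
        if log2fc >= threshold:
            background = "blue"
        elif log2fc <= -threshold:
            background = "red"
        else:
            continue  # change too small: do not color this gene
        lines.append(f"{gene} {background},{text_color}")
    return "\n".join(lines)


# Made-up fold changes, for illustration only.
example = {"BPt01": 2.3, "secE": 1.4, "nusG": 0.2, "rplK": -1.8, "glyQ": -3.0}
print(to_kegg_color_lines(example))
```

The printed text has one gene per line and can be pasted into the Search & Color Pathway input field.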
Some of the colors that can be used are: black, grey, brown, red, orange, yellow, gold, pink, magenta, violet, purple, maroon, blue, cyan, green, chartreuse, and others (do not use white).
If you do not want to type, download this file to format your dataset:
Download dataset format file
3. Visualize the data
Fill in the fields “Search against:” and “Enter objects”, and leave the other fields and options at their default settings. Click “Exec”. The pathway search results will indicate:
- Objects not found: genes that could not be mapped, either because the locus tag or gene name is not present in the database or because it has not been associated with any pathway for this organism.
- List of pathways with hits: the number in parentheses at the end of each line indicates the number of hits in that pathway.
By clicking on a pathway, you can visualize the genes in your dataset, shown in different colors if you used the Search & Color Pathway option.
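To keep track of which genes did not make it onto any pathway, the submitted list can be compared with the names reported under “Objects not found”. This is a small sketch under the assumption that you copy those names from the results page by hand; `split_mapped` is a hypothetical helper, and the "not found" names in the example are invented for illustration, not real KEGG output.

```python
def split_mapped(submitted, not_found):
    """Split the submitted genes into (mapped, unmapped), keeping input order.

    'not_found' holds the names copied from the "Objects not found" part of
    the results page; names are compared exactly as written.
    """
    missing = set(not_found)
    mapped = [gene for gene in submitted if gene not in missing]
    unmapped = [gene for gene in submitted if gene in missing]
    return mapped, unmapped


# The test gene set from this tutorial.
submitted = ["BPt01", "BPt02", "BPt03", "BPt04", "secE",
             "nusG", "rplK", "rplA", "rplJ", "glyQ"]
# Invented example of names reported as not found.
not_found = ["BPt01", "BPt02"]

mapped, unmapped = split_mapped(submitted, not_found)
print("Mapped:", mapped)
print("Not found:", unmapped)
```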