Discussion:
Full transcriptome GTF files; hg19, custom track and whole genome
John Herbert
2011-04-15 14:31:37 UTC
Dear Genome browsers,
Please can you tell me how to display a full genome's/transcriptome's worth of transcripts in the hg19 genome browser?

I assume that a full transcriptome GTF file is too big to upload directly, so I was thinking of writing a GTF to BED script (and then converting to bigBed) so I can display it.

I did not find the answer in your archive and would appreciate any advice, or a program that converts GTF to BED.

Thank you in advance,

Kind regards,

John.
Katrina Learned
2011-04-18 16:35:16 UTC
Hi John,

If your GTF is ~1.5 million lines or less, please gzip it and try
uploading it. If your GTF is significantly larger and/or the upload
fails for some other reason, you may need to do the extra work to
convert to bigBed. Doing so will take a few steps:

1. gtfToGenePred (which can be obtained from
   http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/)
2. genePredToBed (a simple awk script pasted at the end of my message)
3. bedToBigBed (also obtained from
   http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/)

The first two commands could be run in a pipe:

gtfToGenePred my.gtf stdout | genePredToBed > my.bed

gtfToGenePred can accept a .gtf.gz. bedToBigBed can't take a pipe as
input because it does several passes on the input file, so it has to be
a real bed file (and not .bed.gz either). The bed file doesn't have to
be kept after the bigBed file is created.
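
The remaining bedToBigBed step could be sketched roughly as follows. This is only an illustration under some assumptions: gtfToGenePred, bedToBigBed and fetchChromSizes (another UCSC utility) are on your PATH, the awk script at the end of this message has been saved as an executable named genePredToBed, and the function name gtf_to_bigbed and all file names are placeholders. bedToBigBed expects its input sorted by chromosome and start position, hence the sort.

```shell
# Sketch only: gtf_to_bigbed is a hypothetical helper, not a UCSC tool.
gtf_to_bigbed() {
    gtf_in="$1"     # input GTF (placeholder name, e.g. my.gtf or my.gtf.gz)
    assembly="$2"   # assembly name, e.g. hg19
    bb_out="$3"     # bigBed file to create

    # bedToBigBed needs a two-column chromosome sizes file.
    fetchChromSizes "$assembly" > "$assembly.chrom.sizes" || return 1

    # GTF -> genePred -> BED, sorted by chromosome, then by start.
    gtfToGenePred "$gtf_in" stdout | genePredToBed |
        LC_ALL=C sort -k1,1 -k2,2n > "$bb_out.tmp.bed" || return 1

    # bedToBigBed reads a real, uncompressed BED file (no pipe).
    bedToBigBed "$bb_out.tmp.bed" "$assembly.chrom.sizes" "$bb_out" || return 1

    # The intermediate BED file is not needed once the bigBed exists.
    rm -f "$bb_out.tmp.bed"
}
```

It would be called as, for example, gtf_to_bigbed my.gtf.gz hg19 my.bb.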

Please contact the mail list (***@soe.ucsc.edu) again if you have any
further questions.

Katrina Learned
UCSC Genome Bioinformatics Group

#!/usr/bin/awk -f

#
# Convert genePred file to a bed file (on stdout)
#
BEGIN {
    FS="\t";
    OFS="\t";
}
{
    name=$1
    chrom=$2
    strand=$3
    start=$4
    end=$5
    cdsStart=$6
    cdsEnd=$7
    blkCnt=$8

    delete starts
    split($9, starts, ",");
    delete ends
    split($10, ends, ",");
    blkStarts=""
    blkSizes=""
    for (i = 1; i <= blkCnt; i++) {
        blkSizes = blkSizes (ends[i]-starts[i]) ",";
        blkStarts = blkStarts (starts[i]-start) ",";
    }

    print chrom, start, end, name, 1000, strand, cdsStart, cdsEnd, 0,
        blkCnt, blkSizes, blkStarts
}
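
To see what the script does to a single record, here is a small worked example. It is a sketch, not part of the original script: the transcript "tx1" and its coordinates are made up, and the function name genepred_to_bed is a placeholder wrapping a condensed copy of the same awk logic.

```shell
# Hypothetical wrapper around a condensed copy of the awk logic above.
genepred_to_bed() {
    awk 'BEGIN { FS = "\t"; OFS = "\t" }
    {
        delete starts; split($9, starts, ",")
        delete ends;   split($10, ends, ",")
        blkSizes = ""; blkStarts = ""
        for (i = 1; i <= $8; i++) {
            blkSizes  = blkSizes  (ends[i] - starts[i]) ","   # exon length
            blkStarts = blkStarts (starts[i] - $4) ","        # offset from txStart
        }
        print $2, $4, $5, $1, 1000, $3, $6, $7, 0, $8, blkSizes, blkStarts
    }'
}

# Made-up genePred record: two exons on chr1, 100-200 and 300-500.
printf 'tx1\tchr1\t+\t100\t500\t150\t450\t2\t100,300,\t200,500,\n' | genepred_to_bed
# Prints one tab-separated BED line with the fields:
# chr1 100 500 tx1 1000 + 150 450 0 2 100,200, 0,200,
```

The last two columns are the exon sizes (end minus start for each exon) and the exon starts relative to the transcript start.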