Digest for genome@soe.ucsc.edu - 7 updates in 3 topics

g***@soe.ucsc.edu

2014-10-11 17:18:20 UTC

=============================================================================
Today's topic summary
=============================================================================

Group: ***@soe.ucsc.edu
Url:
https://groups.google.com/a/soe.ucsc.edu/forum/?utm_source=digest&utm_medium=email/#!forum/genome/topics

- Need help with running BLAT on CentOS [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/6397885d9d8f22bd
- rn6 GTF question [4 Updates]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/6490c59bc5744997
- Transcriptional Start Site Output [2 Updates]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/6e9665e219a1bcce

=============================================================================
Topic: Need help with running BLAT on CentOS
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/6397885d9d8f22bd
=============================================================================

---------- 1 of 1 ----------
From: Jonathan Casper <***@soe.ucsc.edu>
Date: Oct 10 05:51PM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/ea4016d01713a79b

Hello Aneesha,

You do not have to install gfServer and gfClient to run BLAT. You may find,
however, that gfServer and gfClient are a better way to use BLAT depending
on your needs. The gfServer daemon keeps some data in memory, which can
result in faster alignments if you plan to run a series of alignments
against the same target sequence. Otherwise, BLAT will run fine on a single
compute node if that is what you want to do.

I am still curious about your problems compiling BLAT from the source code
in blatSrc35.zip. Can you please go into the blatSrc/lib/ directory and run
the following command?

env CFLAGS="-E" make net.o

Then send me the resulting net.o file. It might help us figure out where
the problem is coming from.

I hope this is helpful. If you have any further questions, please reply to
***@soe.ucsc.edu or genome-***@soe.ucsc.edu. Questions sent to those
addresses will be archived in publicly-accessible forums for the benefit of
other users. If your question contains sensitive data, you may send it
instead to genome-***@soe.ucsc.edu.

--
Jonathan Casper
UCSC Genome Bioinformatics Group

=============================================================================
Topic: rn6 GTF question
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/6490c59bc5744997
=============================================================================

---------- 1 of 4 ----------
From: Christa-Lynn Blenck <***@Colorado.EDU>
Date: Oct 10 12:01PM -0600
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/722e92482f7bbe3b

Hello,

I am performing a sequencing experiment in rat tissue and I have been utilizing your newest rat genome assembly rn6. I downloaded the GTF file from your website for genes and gene predictions based upon the rn6 assembly. However, there aren't any genes in the GTF file on the mitochondrial chromosome. I was wondering if you know of a way for me to get a GTF file for all the chromosomes, including the mitochondria? I have attached a picture of the settings I used to download the GTF file.

Thanks for any help,

Christa Blenck

Graduate Student
University of Colorado at Boulder
Leinwand Lab

---------- 2 of 4 ----------
From: Luvina Guruvadoo <***@soe.ucsc.edu>
Date: Oct 10 12:17PM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/6d4a7338ec8353db

Hello Christa,

Thank you for your question. RefSeq does not provide mitochondrial
annotations. We recommend using the Ensembl track on the rn5 assembly
instead. Rn6 and rn5 use the same mitochondrial sequence. Using the Table
Browser, select the rn5 assembly, then select the "Ensembl Genes" track.
Type in "chrM" next to position. Select "GTF" as your output format and
click "get output".

If you have any further questions, please reply to ***@soe.ucsc.edu. All
messages sent to that address are archived on a publicly-accessible forum.
If your question includes sensitive data, you may send it instead to
genome-***@soe.ucsc.edu.

- - -
Luvina Guruvadoo
UCSC Genome Bioinformatics Group

---------- 3 of 4 ----------
From: Christa-Lynn Blenck <***@Colorado.EDU>
Date: Oct 10 02:18PM -0600
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/51410fa9207e7d51

Thanks for your help. I just have one more question on this matter. If I use the rn5 Ensembl track, should I map my sequences to the rn5 genome instead of rn6? I only ask because so far I have done all of my analysis by mapping my reads to the rn6 assembly. Do I need to re-do that analysis using both the rn5 genome and GTF? Or are you suggesting that I add the rn5 mitochondrial annotations to my rn6 GTF?

Thanks,

Christa Blenck

---------- 4 of 4 ----------
From: Luvina Guruvadoo <***@soe.ucsc.edu>
Date: Oct 10 01:24PM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/7ef1d2cc0e30f0d6

Hi Christa,

No, you would not have to re-do your analysis; I was suggesting you just
add the rn5 mitochondrial annotations. The chrM sequence is the same for
both rn5 and rn6.
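
A minimal sketch of that merge, assuming both annotation sets are already
available as GTF text: it copies only the chrM records from the rn5 file onto
the end of the rn6 records. The function name `add_mito_annotations` and the
sample records are hypothetical, made up for illustration; they are not real
rn5 or rn6 annotations.

```python
def add_mito_annotations(rn6_lines, rn5_lines, mito="chrM"):
    """Return the rn6 GTF lines plus the rn5 lines whose seqname is `mito`."""

    def seqname(line):
        # GTF is tab-separated; the first column is the sequence name.
        return line.split("\t", 1)[0]

    def is_record(line):
        # Skip blank lines and comment/header lines.
        return bool(line.strip()) and not line.startswith("#")

    merged = list(rn6_lines)
    # If the rn6 set already has chrM records, add nothing (avoids duplicates).
    if any(is_record(l) and seqname(l) == mito for l in rn6_lines):
        return merged
    merged.extend(l for l in rn5_lines if is_record(l) and seqname(l) == mito)
    return merged


# Hypothetical records (invented for this example, not real annotations).
rn6_gtf = [
    'chr1\texample\texon\t100\t200\t.\t+\t.\tgene_id "geneA"; transcript_id "txA";',
]
rn5_gtf = [
    'chr1\texample\texon\t150\t250\t.\t+\t.\tgene_id "geneA"; transcript_id "txA";',
    'chrM\texample\texon\t1\t50\t.\t+\t.\tgene_id "geneM"; transcript_id "txM";',
]

combined = add_mito_annotations(rn6_gtf, rn5_gtf)
for line in combined:
    print(line)
```

With real files, the two lists would come from reading the rn6 and rn5 GTF
downloads line by line. Reusing the rn5 coordinates is only reasonable here
because the chrM sequence is the same in both assemblies.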

If you have any further questions, please reply to ***@soe.ucsc.edu. All
messages sent to that address are archived on a publicly-accessible forum.
If your question includes sensitive data, you may send it instead to
genome-***@soe.ucsc.edu.

- - -
Luvina Guruvadoo
UCSC Genome Bioinformatics Group

=============================================================================
Topic: Transcriptional Start Site Output
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/6e9665e219a1bcce
=============================================================================

---------- 1 of 2 ----------
From: ruvalcabatrejo <***@gmail.com>
Date: Oct 10 10:39AM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/e63884efd477bb43

Hello,

I searched for the transcriptional start site, cds start, exon count, and
protein ID for the human gene POU5F1. The output shows correct information
for the first 6 lines starting with uc003nsu.4, but I am not sure what the rest
of the information means. The bottom half repeats the protein IDs several
times, each time with a different TSS and CDS start. I am new at using the
browser and I am not sure if I need to add another filter?

#filter: kgXref.geneSymbol = 'pou5F1'
#hg38.knownGene.txStart hg38.knownGene.cdsStart hg38.knownGene.exonCount hg38.knownGene.proteinID hg38.kgXref.geneSymbol
31164336 31164600 4 M1S623 POU5F1
31164336 31164600 5 M1S623 POU5F1
31164336 31164600 3 F2Z381 POU5F1
31164336 31164600 5 Q01860 POU5F1
31165950 31165950 1 POU5F1
31170215 31170215 1 POU5F1
2646760 2647024 4 M1S623 POU5F1
2646760 2647024 5 M1S623 POU5F1
2646760 2647024 4 F2Z381 POU5F1
2646760 2647024 5 Q01860 POU5F1
2648384 2648384 1 POU5F1
2652657 2652657 1 POU5F1
2425276 2425276 1 POU5F1
2423652 2423916 4 M1S623 POU5F1
2423652 2423916 5 M1S623 POU5F1
2423652 2423916 4 F2Z381 POU5F1
2429549 2429549 1 POU5F1
2423652 2423916 5 Q01860 POU5F1
2474837 2475101 4 M1S623 POU5F1
2474837 2475101 5 M1S623 POU5F1
2474837 2475101 4 F2Z381 POU5F1
2474837 2475101 5 Q01860 POU5F1
2476461 2476461 1 POU5F1
2480735 2480735 1 POU5F1
2508451 2508715 4 M1S623 POU5F1
2508451 2508715 5 M1S623 POU5F1
2508451 2508715 4 F2Z381 POU5F1
2508451 2508715 5 Q01860 POU5F1
2510075 2510075 1 POU5F1
2514349 2514349 1 POU5F1
2422369 2422633 4 M1S623 POU5F1
2422369 2422633 5 M1S623 POU5F1
2422369 2422633 4 F2Z381 POU5F1
2422369 2422633 5 Q01860 POU5F1
2423993 2423993 1 POU5F1
2428265 2428265 1 POU5F1
2467752 2468016 4 M1S623 POU5F1
2467752 2468016 5 M1S623 POU5F1
2467752 2468016 4 F2Z381 POU5F1
2467752 2468016 5 Q01860 POU5F1
2469376 2469376 1 POU5F1
2473647 2473647 1 POU5F1

Thank you for your help,

Laura

---------- 2 of 2 ----------
From: Luvina Guruvadoo <***@soe.ucsc.edu>
Date: Oct 10 12:02PM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/ec10d68a2998cd99

Hello Laura,

Thanks for your question. If you include the chromosome name in your
output, you will find that this particular gene is on chr6, and all of the
other results are various transcripts from multiple haplotype chromosomes:

#filter: kgXref.geneSymbol = 'pou5F1'
#hg38.kgXref.geneSymbol hg38.knownGene.chrom hg38.knownGene.txStart hg38.knownGene.cdsStart hg38.knownGene.exonCount hg38.knownGene.proteinID
POU5F1 chr6 31165950 31165950 1
POU5F1 chr6 31170215 31170215 1
POU5F1 chr6 31164336 31164600 5 Q01860
POU5F1 chr6 31164336 31164600 4 M1S623
POU5F1 chr6 31164336 31164600 5 M1S623
POU5F1 chr6 31164336 31164600 3 F2Z381
POU5F1 chr6_GL000251v2_alt 2648384 2648384 1
POU5F1 chr6_GL000251v2_alt 2652657 2652657 1
POU5F1 chr6_GL000251v2_alt 2646760 2647024 5 Q01860
POU5F1 chr6_GL000251v2_alt 2646760 2647024 4 M1S623
POU5F1 chr6_GL000251v2_alt 2646760 2647024 5 M1S623
POU5F1 chr6_GL000251v2_alt 2646760 2647024 4 F2Z381
POU5F1 chr6_GL000252v2_alt 2425276 2425276 1
POU5F1 chr6_GL000252v2_alt 2429549 2429549 1
POU5F1 chr6_GL000252v2_alt 2423652 2423916 5 Q01860
POU5F1 chr6_GL000252v2_alt 2423652 2423916 4 M1S623
POU5F1 chr6_GL000252v2_alt 2423652 2423916 5 M1S623
POU5F1 chr6_GL000252v2_alt 2423652 2423916 4 F2Z381
POU5F1 chr6_GL000253v2_alt 2476461 2476461 1
POU5F1 chr6_GL000253v2_alt 2480735 2480735 1
POU5F1 chr6_GL000253v2_alt 2474837 2475101 5 Q01860
POU5F1 chr6_GL000253v2_alt 2474837 2475101 4 M1S623
POU5F1 chr6_GL000253v2_alt 2474837 2475101 5 M1S623
POU5F1 chr6_GL000253v2_alt 2474837 2475101 4 F2Z381
POU5F1 chr6_GL000254v2_alt 2510075 2510075 1
POU5F1 chr6_GL000254v2_alt 2514349 2514349 1
POU5F1 chr6_GL000254v2_alt 2508451 2508715 5 Q01860
POU5F1 chr6_GL000254v2_alt 2508451 2508715 4 M1S623
POU5F1 chr6_GL000254v2_alt 2508451 2508715 5 M1S623
POU5F1 chr6_GL000254v2_alt 2508451 2508715 4 F2Z381
POU5F1 chr6_GL000255v2_alt 2423993 2423993 1
POU5F1 chr6_GL000255v2_alt 2428265 2428265 1
POU5F1 chr6_GL000255v2_alt 2422369 2422633 5 Q01860
POU5F1 chr6_GL000255v2_alt 2422369 2422633 4 M1S623
POU5F1 chr6_GL000255v2_alt 2422369 2422633 5 M1S623
POU5F1 chr6_GL000255v2_alt 2422369 2422633 4 F2Z381
POU5F1 chr6_GL000256v2_alt 2469376 2469376 1
POU5F1 chr6_GL000256v2_alt 2473647 2473647 1
POU5F1 chr6_GL000256v2_alt 2467752 2468016 5 Q01860
POU5F1 chr6_GL000256v2_alt 2467752 2468016 4 M1S623
POU5F1 chr6_GL000256v2_alt 2467752 2468016 5 M1S623
POU5F1 chr6_GL000256v2_alt 2467752 2468016 4 F2Z381
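
If only the primary chr6 transcripts are of interest, the alternate haplotype
rows can be dropped after download. The following is a minimal sketch,
assuming the whitespace-separated column order shown above and that alternate
sequences can be recognized by the `_alt` suffix in their names. The function
name `primary_rows` is hypothetical, and the sample is a small subset of the
rows above.

```python
def primary_rows(lines):
    """Parse Table Browser rows and keep those not on '_alt' sequences."""
    rows = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and the '#' filter/header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 5:
            continue  # not a complete data row
        symbol, chrom, tx_start, cds_start, exon_count = fields[:5]
        # proteinID is the last column and is empty for some transcripts.
        protein_id = fields[5] if len(fields) > 5 else ""
        if chrom.endswith("_alt"):
            continue  # alternate haplotype sequence
        rows.append(
            (symbol, chrom, int(tx_start), int(cds_start), int(exon_count), protein_id)
        )
    return rows


sample = """\
#filter: kgXref.geneSymbol = 'pou5F1'
POU5F1 chr6 31165950 31165950 1
POU5F1 chr6 31164336 31164600 5 Q01860
POU5F1 chr6_GL000251v2_alt 2648384 2648384 1
POU5F1 chr6_GL000251v2_alt 2646760 2647024 5 Q01860
"""

rows = primary_rows(sample.splitlines())
for row in rows:
    print(row)
```

Splitting on whitespace works for this column order because the only field
that can be empty, proteinID, comes last.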

If you have any further questions, please reply to ***@soe.ucsc.edu. All
messages sent to that address are archived on a publicly-accessible forum.
If your question includes sensitive data, you may send it instead to
genome-***@soe.ucsc.edu.

- - -
Luvina Guruvadoo
UCSC Genome Bioinformatics Group

--
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page:
https://groups.google.com/a/soe.ucsc.edu/forum/?utm_source=digest&utm_medium=email/#!forum/genome/join
.
To unsubscribe from this group and stop receiving emails from it, send an email to genome+***@soe.ucsc.edu.
