Discussion:
Digest for genome@soe.ucsc.edu - 16 Messages in 14 Topics
g***@soe.ucsc.edu
2013-07-22 17:10:46 UTC
=============================================================================
Today's Topic Summary
=============================================================================

Group: ***@soe.ucsc.edu
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/topics

- Mining mouse ENCODE data [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/c0e4abbe4dae27bb
- variant annotation integrator [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/f0816e65d687f56
- ucsc knownGene mapping to the wrong uniprot entry ? [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/69ebe2f2d42b400e
- Gene Symbol for UCSC Genes [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/3f7861f4bfbbd36c
- Sessions using human genomes not working [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/187dc0439120e001
- Non-Human RefSeq Genes [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/53552d8c248613a
- Table Browser RefSeq Genes refGene GTF Output [2 Updates]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/e6412c75de707e1c
- Problem in MAF files? [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/17fe254789c0506e
- phylop acceleration score [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/48e3bbcd6681ad0b
- obtaining figure quality image [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/62d352b3db0b9364
- bigWigAverageOverBed accounts for strand information or not? [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/2c586b0bbb054106
- Model Generation With PhastCons [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/9650de2c5a312b6c
- Usage of data from your database for other database [2 Updates]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/a1235fbc126ac120
- mouse mm9; all custom tracks disappeared [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/11aae550472f7391


=============================================================================
Topic: Mining mouse ENCODE data
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/c0e4abbe4dae27bb
=============================================================================

---------- 1 of 1 ----------
From: Brian Lee <***@soe.ucsc.edu>
Date: Jul 22 09:36AM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/9ae0c2c9d8119c99

Dear Rebecca,

After sending my response, I realized that I was providing information
about the human assembly, and your inquiry was directed toward
investigations in mouse. You could use our liftOver utility to translate
your inquiry coordinates from the mouse assembly to hg19, and then do the
steps I gave, but below I wanted to provide the corresponding mm9 track
information.

I would initially suggest browsing the mm9 DNase tracks of interest, and
then doing a subtrack merge of either all the tracks, or just the ones in
cell lines you would prefer.

http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=mm9&g=wgEncodePsuDnase
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=mm9&g=wgEncodeUwDnase
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=mm9&g=wgEncodeUwDgf

After reviewing the tracks and data, create a subtrack merge in the Table
Browser:

1. Make the following selections:

Clade: Mammal
Genome: Mouse
Assembly: July 2007 (NCBI37/mm9)
Group: Expression and Regulation
Track: PSU DNaseI HS or UW DNaseI DGF or UW DNaseI HS
Table: (select a table of interest, for example
wgEncodeUwDnase3134RiiiMImmortalPkRep1 from UW DNaseI HS)
(Be sure "region: genome" is selected)

2. Click "create" next to "subtrack merge:"

3. Check the boxes next to the other cell-line tracks of interest (you
probably only want peak tracks, so unselect any other default tracks).

4. Select "All wgEncodeUwDnase3134RiiiMImmortalPkRep1 records, as well as
all records from all other selected subtracks", if you want everything
output together as one large track, then click "submit".

5. Change "output format" to "custom track" and then select "get output".

6. You can give this track a name, for example "Cell-line X,Y,Z merge".

7. Click "get custom track in genome browser" to see the results.

You can then follow the very first instructions in the last email regarding
the DNase Clusters track intersection, but instead use this "Cell-line
X,Y,Z merge" custom track with proper mm9 information in place of the hg19
DNase Clusters track.

If you want to liftOver your mm9 coordinates to the hg19 assembly, please
visit the liftOver page: http://genome.ucsc.edu/cgi-bin/hgLiftOver. You
could upload your mm9 sites in a text file of BED or chrN:start-end
formatted coordinates.
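
If your sites are stored as position strings, a small script can turn them into BED lines before uploading. The following is only a sketch: the helper name position_to_bed and the example sites are made up for illustration. It relies on the usual convention that chrN:start-end positions are 1-based and inclusive, while BED starts are 0-based and ends are exclusive.

```python
import re

# Hypothetical helper, not part of any UCSC tool.
_POSITION = re.compile(r"^(\w+):([\d,]+)-([\d,]+)$")


def position_to_bed(position):
    """Convert 'chrN:start-end' (1-based, inclusive) to a BED line (0-based, half-open)."""
    match = _POSITION.match(position.strip())
    if match is None:
        raise ValueError(f"not a chrN:start-end position: {position!r}")
    chrom, start, end = match.groups()
    start = int(start.replace(",", ""))
    end = int(end.replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid range in {position!r}")
    # Only the start shifts by one; the end value is the same in both conventions.
    return f"{chrom}\t{start - 1}\t{end}"


sites = ["chr1:1,000-2,000", "chrX:500-500"]  # made-up example sites
bed_lines = [position_to_bed(site) for site in sites]
print("\n".join(bed_lines))
```

Either form can be uploaded; the conversion mainly matters if you later mix the lifted coordinates with other BED files.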

Thank you again for your inquiry and using the UCSC Genome Browser. If you
have further questions, please feel free to contact the mailing list again
at ***@soe.ucsc.edu.

All the best,

Brian Lee
UCSC Genome Bioinformatics Group





=============================================================================
Topic: variant annotation integrator
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/f0816e65d687f56
=============================================================================

---------- 1 of 1 ----------
From: "Andréanne Morin" <***@mail.mcgill.ca>
Date: Jul 22 04:25PM
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/afc1fe0e097e9e42




=============================================================================
Topic: ucsc knownGene mapping to the wrong uniprot entry ?
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/69ebe2f2d42b400e
=============================================================================

---------- 1 of 1 ----------
From: Pierre Lindenbaum <***@univ-nantes.fr>
Date: Jul 22 05:39PM +0200
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/b9a9b2591c3c5d6

Hi ucsc,

for your information:

The hg19.knownGene "uc010rxb.2" is said to be linked to the uniprot
entry: http://www.uniprot.org/uniprot/G5E986

however, the translated sequence in the UCSC browser
(http://genome.ucsc.edu/cgi-bin/hgGene?hgg_do_getProteinSeq=1&hgg_gene=uc010rxb.2)
starts with

MLGKLAMLLW....


whereas the uniprot entry starts with

MLLWVQ....

The same problem occurs for uc003wim.4, uc002mee.1, uc011cxk.2, etc.

Regards,

Pierre



=============================================================================
Topic: Gene Symbol for UCSC Genes
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/3f7861f4bfbbd36c
=============================================================================

---------- 1 of 1 ----------
From: Andy Rampersaud <***@bu.edu>
Date: Jul 22 11:30AM -0400
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/d01758ed25368f18

Hello,

I'm trying to get the full set of UCSC genes in GTF format. I've used the
table browser to retrieve assembly: mm9 track: UCSC Genes table: knownGene
as well as the command:

genePredToGtf mm9 knownGene UCSC_Genes.gtf

But in both cases I see that the gene name and accession numbers are
different compared to the RefSeq genes. I would like to compare UCSC genes
with RefSeq genes to get a sense of common and unique genes.

Questions:
1. Why does the UCSC gene table use a different naming scheme (compared to
RefSeq genes)?
2. Is there a way to compare UCSC genes with RefSeq genes? Is there a way
to get gene symbol for UCSC genes?

Thanks,
Andy
--
Andy Rampersaud
Graduate Student, Bioinformatics
Waxman Lab, Boston University



=============================================================================
Topic: Sessions using human genomes not working
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/187dc0439120e001
=============================================================================

---------- 1 of 1 ----------
From: "Price, David H" <david-***@uiowa.edu>
Date: Jul 22 03:08PM
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/1dab06fa883d9f6e

All sessions I have saved that access hg18 or hg19 have stopped working. I can hear the initial read (hard drive access) of the data off my server, but it is truncated and then the session times out after about a minute.
I can create new sessions that work and access the data from my server, and I can access old sessions that use mouse (mm8 or mm9), Drosophila or Xenopus genomes.
Is something broken at the UCSC end?
David

David H. Price
Professor of Biochemistry
University of Iowa
375 Newton Rd.
Iowa City, IA 52242

(319) 335-7910
http://www.uiowa.edu/~pricelab/



=============================================================================
Topic: Non-Human RefSeq Genes
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/53552d8c248613a
=============================================================================

---------- 1 of 1 ----------
From: Brian Smith <***@gmail.com>
Date: Jul 22 10:51AM -0400
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/3a00dcf14a46c99f

How can I get the UCSC genome browser to display species that are not yet
displayed on the track?

thanks!



=============================================================================
Topic: Table Browser RefSeq Genes refGene GTF Output
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/e6412c75de707e1c
=============================================================================

---------- 1 of 2 ----------
From: Andy Rampersaud <***@bu.edu>
Date: Jul 22 09:50AM -0400
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/4f3ebb531df9edc4

Hi Jonathan,

Thank you for your helpful solution. I was successful in getting the GTF
file I was seeking. Minor question: Are the Kent command-line utilities
available for 32-bit Linux operating systems? I have access to a 64-bit
server and was able to get the tool working, but I was just curious.

2nd Question:

I was hoping to find the UCSC source for a GTF file I obtained from a
fellow student. The path to the GTF gene file:

genome/mm9bowtie2/Mus_musculus/UCSC/mm9/Annotation/Archives/archive-2012-03-09-05-07-56/Genes/genes.gtf

I have also attached a file listing of this directory
(genome_file_list.txt).

I would like to know where and how one would download this folder from
UCSC. I basically want to make sure I'm using the most up-to-date genes.gtf
file.

Thanks,
Andy
--
Andy Rampersaud
Graduate Student, Bioinformatics
Waxman Lab, Boston University


---------- 2 of 2 ----------
From: Andy Rampersaud <***@bu.edu>
Date: Jul 22 09:52AM -0400
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/c7ac4fa3486973f2

(Attachment added)




=============================================================================
Topic: Problem in MAF files?
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/17fe254789c0506e
=============================================================================

---------- 1 of 1 ----------
From: Koustav Pal <***@gmail.com>
Date: Jul 22 10:35AM +0530
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/9e1d2efbab01cde5

I downloaded the rheMac2 pairwise alignments for hg19, panTro3, gorGor3,
ponAbe2 and calJac3. The lines in these files looked like this:

a score=26865.0
s chr10 56462 33 + 94855758 CTTCCTGATCGTGTGGTCTATGACTCTACCCCT
s chr20 17556 33 - 62736349 CGTCCTGCCCGTATGGTCTATGACTCCACCCCT

I ran multiz on these files and later, while running phastCons, I
realized that a tree had to be provided. A tree cannot be provided without
editing the lines, so I added the species names to the rows so that they
looked like this:

a score=26865.0
s rheMac2.chr10 56462 33 + 94855758 CTTCCTGATCGTGTGGTCTATGACTCTACCCCT
s calJac3.chr20 17556 33 - 62736349 CGTCCTGCCCGTATGGTCTATGACTCCACCCCT

I ran multiz on these files once again, but this time multiz gave me an error:

line 11 of organism1.organism2.maf : inconsistent row size

Can someone please suggest a fix for the issue?
And why is the organism name not included in the pairwise
alignment files?
--
Regards,
Koustav Pal,
Junior Project Fellow
Vinod Scaria Labs,
Open Source Drug Discovery Project,
CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB),
New Delhi, India.
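
The hand edit described in this message can be scripted so that only the source field of each row changes and the alignment text is left untouched. The sketch below is an illustration, not a UCSC tool: the function name prefix_maf_sources is made up, and it assumes that every block lists its rows in the same species order. It also checks that all rows in a block have alignment text of the same length, which is one plausible (unconfirmed) reason for an "inconsistent row size" complaint after manual editing.

```python
def prefix_maf_sources(lines, species):
    """Prefix the src field of each 's' row with a species name, block by block.

    `species` lists assembly names in the order rows appear in each block,
    e.g. ["rheMac2", "calJac3"] for a pairwise file (an assumption about the input).
    """
    out = []
    row = 0          # index of the next 's' row within the current block
    text_len = None  # alignment text length of the first row in the block
    for line in lines:
        fields = line.split()
        if not fields:
            out.append(line)
        elif fields[0] == "a":
            # A new alignment block starts: reset the per-block state.
            row, text_len = 0, None
            out.append(line)
        elif fields[0] == "s":
            tag, src, start, size, strand, src_size, text = fields
            if text_len is None:
                text_len = len(text)
            elif len(text) != text_len:
                raise ValueError(f"rows differ in length near: {line!r}")
            out.append(" ".join(
                [tag, f"{species[row]}.{src}", start, size, strand, src_size, text]))
            row += 1
        else:
            out.append(line)
    return out


block = [
    "a score=26865.0",
    "s chr10 56462 33 + 94855758 CTTCCTGATCGTGTGGTCTATGACTCTACCCCT",
    "s chr20 17556 33 - 62736349 CGTCCTGCCCGTATGGTCTATGACTCCACCCCT",
]
renamed = prefix_maf_sources(block, ["rheMac2", "calJac3"])
print("\n".join(renamed))
```

Running it on the two rows quoted in the message reproduces the edited rows, with the sequence columns unchanged.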



=============================================================================
Topic: phylop acceleration score
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/48e3bbcd6681ad0b
=============================================================================

---------- 1 of 1 ----------
From: Neeba Dijo <***@gmail.com>
Date: Jul 22 09:17AM +0530
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/c7f1b902e9860d52

Hi Team,

I downloaded the phyloP scores for all chromosomes from the location

ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/phyloP46way/placentalMammals/

I found only the conservation score.

Some people mentioned that phyloP has an acceleration score.

Is it available? If so,
where can we download this score from?
--
*Thanks & Regards,*
*Neeba Sebastian*



=============================================================================
Topic: obtaining figure quality image
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/62d352b3db0b9364
=============================================================================

---------- 1 of 1 ----------
From: Theresa Thu Dinh <***@stanford.edu>
Date: Jul 21 01:17PM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/f8682b66d4c06e64

To whom it may concern:

I would like to obtain a figure-quality screenshot from the genome browser for a paper we would like to submit. Could you please guide me on how I can do this? Thank you very much.

Regards,
Theresa Dinh



=============================================================================
Topic: bigWigAverageOverBed accounts for strand information or not?
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/2c586b0bbb054106
=============================================================================

---------- 1 of 1 ----------
From: hulong <***@gmail.com>
Date: Jul 21 09:42PM +0800
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/69654abae57d122f

Dear All,
I'm Hu Long, a PhD student of Tsinghua University, China.
I have a question: Does bigWigAverageOverBed consider the strand information? In Bedtools, intersectBed accepts a "-s" or "-S" parameter to specify the strand. Can bigWigAverageOverBed do the same thing?
Thanks a lot! Looking forward to your reply!
Best

Hu Long.



=============================================================================
Topic: Model Generation With PhastCons
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/9650de2c5a312b6c
=============================================================================

---------- 1 of 1 ----------
From: Koustav Pal <***@gmail.com>
Date: Jul 21 01:07PM +0530
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/84ddfccd05f8fdd1

Hi,

Recently I downloaded the primate pairwise alignment files for rheMac2,
calJac3, panTro3, gorGor3, ponAbe2 and hg19. As done by UCSC, I generated my
MSA files using multiz, but when I try to generate the cons.mod or
noncons.mod file from the same MSA data using phyloFit I get the error "bad
integers or strand in MAF strand must be + for reference sequence".

I understand that it does not accept the - strand for the reference
sequence, but it would be very helpful if someone could tell me how to
resolve this issue.
--
Regards,
Koustav Pal,
Junior Project Fellow
Vinod Scaria Labs,
Open Source Drug Discovery Project,
CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB),
New Delhi, India.
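
One way to get the reference row onto the + strand is to reverse-complement every row of an affected block. The sketch below shows the coordinate arithmetic only; the function names are made up, and whether phyloFit then accepts the file is not something this thread establishes. It uses the MAF convention that a start on the - strand is counted from the start of the reverse-complemented source sequence, so the flipped start is srcSize - start - size.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def flip_row(fields):
    """Reverse-complement one parsed 's' row.

    `fields` is [tag, src, start, size, strand, srcSize, text].
    """
    tag, src, start, size, strand, src_size, text = fields
    new_start = int(src_size) - int(start) - int(size)
    new_strand = "+" if strand == "-" else "-"
    # Gap characters and N are left as they are by translate(); only order changes.
    new_text = text.translate(_COMPLEMENT)[::-1]
    return [tag, src, str(new_start), size, new_strand, src_size, new_text]


def reference_to_plus(block):
    """Flip every 's' row of a block if its first (reference) row is on '-'."""
    rows = [line.split() for line in block]
    if rows and rows[0][4] == "-":
        rows = [flip_row(row) for row in rows]
    return [" ".join(row) for row in rows]


# Made-up toy block; real rows would come from the multiz output.
toy_block = [
    "s hg19.chr1 10 4 - 100 ACG-T",
    "s rheMac2.chr2 5 5 + 50 ACGTT",
]
flipped = reference_to_plus(toy_block)
print("\n".join(flipped))
```

Blocks whose first row is already on + pass through unchanged, so the function can be applied to every block in a file.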



=============================================================================
Topic: Usage of data from your database for other database
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/a1235fbc126ac120
=============================================================================

---------- 1 of 2 ----------
From: Alexander Belostotsky <***@gmail.com>
Date: Jul 20 11:02PM +0400
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/26ff89a9344d92a8

P.S. One more question. Where can I get a file with the full alias list
and descriptions of the proteins encoded by genes? It is used by UCSC when I
enter a gene name.
Thank you one more time! :)
--
With respect,
Sasha


---------- 2 of 2 ----------
From: Alexander Belostotsky <***@gmail.com>
Date: Jul 20 06:13PM +0400
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/7b9d8dffe9fef24

Dear Brooke!

Thank you for your answer!
I have another two questions.
1. Could I use all data for the hg19 assembly in a database (which is
public and non-profit)? Are there any time restrictions, or could it
be used right now? I cannot find a restriction date.
2. Could all ChIP-seq, DNaseI and FAIRE data for different tissues
(separate files) be downloaded as one directory from the UCSC server? Is
it allowed and, if yes, how can I do it?

Thank you for responses!

Best regards,
Sasha



--
With respect,
Sasha



=============================================================================
Topic: mouse mm9; all custom tracks disappeared
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/11aae550472f7391
=============================================================================

---------- 1 of 1 ----------
From: robert kuhn <***@soe.ucsc.edu>
Date: Jul 21 01:17PM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/5cbaa8362742d23c

Hello, Anne,


I see from your email address that you are writing from France.


Is it possible that you missed the announcement last month that
we opened a new server in Europe and are redirecting traffic?
From certain of our pages, users with a European IP address are
sent to genome-euro.ucsc.edu instead of genome.ucsc.edu.


If you visited one of those pages recently, you should have seen a
message that gave you the option to return to the US server. If you
chose to stay on the European server, then the Custom Tracks
associated with your Saved Sessions would not be there. They should
still be on the California server, however. We are sorry for any
inconvenience. Here is some information that should help you.


The announcement:

http://genome-euro.ucsc.edu/goldenPath/newsarch.html#062713

provides some information, including a link to some documentation
on the feature:

http://genome.ucsc.edu/goldenPath/help/genomeEuro.html

To summarize:

Your Custom Tracks should still be associated with your sessions on
the US server (though they do get cleaned if they are unused for a
significant period of time). You have two options:


1. Continue to use the US server as before, with your Custom Tracks
intact. To get there, use the "Mirrors" link at the top of the Browser
and go to "US Server."

2. Migrate your Custom Track data to genome-euro. The Saved
Sessions associated with the Custom Tracks should still be in place
on the new machine, but the custom content was not moved (as
you described). To get the tracks onto genome-euro, either reload
them from your original source, or visit the US Server and use the
Table Browser to download the data to a file. Then reload the file
onto the European Server.


In general, we recommend that our users keep a local copy of
their Custom Track data, just in case something catastrophic
happens on our end. We also recommend using the Track
Hub mechanism described here:


http://genome.ucsc.edu/goldenPath/help/hgTrackHubHelp.html


which gives you local ownership and control of your data, yet
supports full visualization on the Browser.


Best wishes and thanks for your patience while we complete the
transition to the new server. We hope that the European Server
will help improve performance for all of our users.


Do let us know via the mailing list if this does not solve your
problem or if other issues arise.

--b0b kuhn
ucsc genome bioinformatics group

On 7/19/2013 8:56 AM, Gendrel Anne Valerie wrote:




--
g***@soe.ucsc.edu
2013-07-24 17:13:47 UTC
=============================================================================
Today's Topic Summary
=============================================================================

Group: ***@soe.ucsc.edu
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/topics

- phylop acceleration score [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/48e3bbcd6681ad0b
- How to create a GTF for mitochondrial genes [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/4a9a7fee52c03a8
- Blat Intron questions [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/846e5f443b7593fd
- imp [2 Updates]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/f6613b8a91aac199
- In Silico PCR no match [2 Updates]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/448e1860073c9f58
- Table Browser RefSeq Genes refGene GTF Output [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/e6412c75de707e1c
- retrieving complete set of exons (+ coordinates and reading frame positions) [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/f9bc6c7b94973dd9
- bigWigAverageOverBed accounts for strand information or not? [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/2c586b0bbb054106
- need help [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/198c109b69c500b2
- Blat configuration Question [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/208a986432612aed
- Non-Human RefSeq Genes [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/53552d8c248613a
- Will the DGV Struct Var track be updated? [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/896b24a4368f4d48
- Sessions using human genomes not working [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/187dc0439120e001
- New public hub to add [1 Update]
http://groups.google.com/a/soe.ucsc.edu/group/genome/t/88a21325863e98dc


=============================================================================
Topic: phylop acceleration score
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/48e3bbcd6681ad0b
=============================================================================

---------- 1 of 1 ----------
From: "Steve Heitner" <***@soe.ucsc.edu>
Date: Jul 24 09:50AM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/21f3d8242552d9cb

Hello, Neeba.

The answer can be found by reading the Description section of the hg19
Conservation track description page
(http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=cons46way). The third
paragraph states:

Another important difference is that phyloP can measure acceleration (faster
evolution than expected under neutral drift) as well as conservation (slower
than expected evolution). In the phyloP plots, sites predicted to be
conserved are assigned positive scores (and shown in blue), while sites
predicted to be fast-evolving are assigned negative scores (and shown in
red).
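
In other words, the acceleration scores are the negative values in the same phyloP files. As a rough sketch of how to pull them out, the following assumes the downloaded files have been decompressed and use the fixedStep wiggle layout (a declaration line followed by one score per line); the function name accelerated_sites and the sample values are made up for illustration.

```python
def accelerated_sites(wig_lines, threshold=0.0):
    """Yield (chrom, position, score) for fixedStep values below `threshold`.

    Positions are 1-based, as in the fixedStep declaration line.
    """
    chrom, pos, step = None, None, 1
    for line in wig_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("fixedStep"):
            # e.g. "fixedStep chrom=chr1 start=100 step=1"
            params = dict(item.split("=", 1) for item in line.split()[1:])
            chrom = params["chrom"]
            pos = int(params["start"])
            step = int(params.get("step", 1))
            continue
        score = float(line)
        if score < threshold:
            yield chrom, pos, score
        pos += step


sample = [
    "fixedStep chrom=chr1 start=100 step=1",
    "0.5",
    "-1.2",
    "0.1",
    "-0.3",
]
hits = list(accelerated_sites(sample))
print(hits)
```

A stricter cutoff (for example threshold=-2.0) keeps only the more strongly negative sites; which cutoff is meaningful depends on your analysis.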

Please contact us again at ***@soe.ucsc.edu if you have any further
questions.

---
Steve Heitner
UCSC Genome Bioinformatics Group



From: ***@soe.ucsc.edu [mailto:***@soe.ucsc.edu] On Behalf Of Neeba
Dijo
Sent: Sunday, July 21, 2013 8:47 PM
To: ***@soe.ucsc.edu
Subject: [genome] phylop acceleration score



Hi Team,

I downloaded the phyloP scores for all chromosomes from the location

ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/phyloP46way/placentalMammals/

I found only the conservation score.

Some people mentioned that phyloP has an acceleration score.

Is it available? If so,
where can we download this score from?
--
Thanks & Regards,

Neeba Sebastian

--



=============================================================================
Topic: How to create a GTF for mitochondrial genes
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/4a9a7fee52c03a8
=============================================================================

---------- 1 of 1 ----------
From: Zhuohui Gan <***@eng.ucsd.edu>
Date: Jul 23 08:10PM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/a073a9e9b1bfdc01

Dear UCSC genome group,

I am trying to create a GTF for mouse mitochondrial genes for my TopHat
mapping. I found a similar discussion, but the method seems to no longer
work since the browser interface has been updated.
http://redmine.soe.ucsc.edu/forum/index.php?t=msg&goto=12004&S=0f7b762f8a68fc1bf248888c36eaf0a8

Could you help me create a GTF for mouse mitochondrial genes using
your browser?

Thanks,
Zoe



=============================================================================
Topic: Blat Intron questions
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/846e5f443b7593fd
=============================================================================

---------- 1 of 1 ----------
From: Jason Baumohl <***@lbl.gov>
Date: Jul 23 05:03PM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/d8e4c7fe7c17f6e9

Blat Introns:

I assume "-intron=" value is maximum gap between two blocks?
If so is 100000 too big of a value for Blat to handle?
Is there a way to limit the number of blocks?

Thanks,
Jason



=============================================================================
Topic: imp
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/f6613b8a91aac199
=============================================================================

---------- 1 of 2 ----------
From: "anirban mukherjee" <***@rediffmail.com>
Date: Jul 23 11:37PM
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/47541d0fc7a7c03e

Dear sir / madam,

I would be glad if you could tell me the shortest gene in human beings.

Closing with regards,

Yours sincerely,

ANIRBAN MUKHERJEE

E-mail address - ***@rediffmail.com


---------- 2 of 2 ----------
From: robert kuhn <***@soe.ucsc.edu>
Date: Jul 23 04:57PM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/78d19a5c0ab4b438

Dear Anirban,

From our database (which you can query yourself - see
http://genome.ucsc.edu/goldenPath/help/mysql.html)

mysql> SELECT name, geneSymbol, txEnd - txStart AS size FROM knownGene
k, kgXref x WHERE proteinID != "" AND k.name = x.kgId ORDER BY size
LIMIT 2;
+------------+------------+------+
| name       | geneSymbol | size |
+------------+------------+------+
| uc021qqb.1 | SLN        |   96 |
| uc031rnu.1 | OST4       |  114 |
+------------+------------+------+
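
To see what the query does without connecting to the public server, here is a toy reproduction using Python's built-in sqlite3 module. Only the two result rows are taken from the output above; the coordinates, the protein IDs and the extra rows (TOY1, TOY2) are invented so that the filter and the sort have something to act on. The join is written with an explicit JOIN, which is equivalent to the comma form used in the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE knownGene (name TEXT, txStart INTEGER, txEnd INTEGER, proteinID TEXT);
CREATE TABLE kgXref (kgId TEXT, geneSymbol TEXT);
""")

# Coordinates and protein IDs are made up; only names, symbols and sizes
# of the first two rows match the result table in the message.
conn.executemany("INSERT INTO knownGene VALUES (?, ?, ?, ?)", [
    ("uc021qqb.1", 1000, 1096, "P_TOY_A"),  # size 96
    ("uc031rnu.1", 5000, 5114, "P_TOY_B"),  # size 114
    ("ucToy001.1", 200, 250, ""),           # shorter, but no protein: filtered out
    ("ucToy002.1", 9000, 9900, "P_TOY_C"),  # coding, but longer
])
conn.executemany("INSERT INTO kgXref VALUES (?, ?)", [
    ("uc021qqb.1", "SLN"),
    ("uc031rnu.1", "OST4"),
    ("ucToy001.1", "TOY1"),
    ("ucToy002.1", "TOY2"),
])

rows = conn.execute("""
    SELECT k.name, x.geneSymbol, k.txEnd - k.txStart AS size
    FROM knownGene AS k
    JOIN kgXref AS x ON k.name = x.kgId
    WHERE k.proteinID != ''
    ORDER BY size
    LIMIT 2
""").fetchall()
print(rows)
```

The size here is the genomic span of the transcript (txEnd - txStart), so it includes introns where a transcript has them.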

regards,

--b0b kuhn
ucsc genome bioinformatics group

On 7/23/2013 4:37 PM, anirban mukherjee wrote:



=============================================================================
Topic: In Silico PCR no match
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/448e1860073c9f58
=============================================================================

---------- 1 of 2 ----------
From: "Kim, Jake (NIH/NIMH) [F]" <***@nih.gov>
Date: Jul 23 06:18PM
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/378c90ca91b228c

Hello,

I used the UCSC In-Silico PCR to locate the following PCR primers:
Forward primer: GGATGTCCCCAAGCATCATT
Reverse primer: TTTGAGACCAGCCTGACCAA

However, it says "No matches to GGATGTCCCCAAGCATCATT TTTGAGACCAGCCTGACCAA in Human Feb. 2009 (GRCh37/hg19)."

I went back to BLAT Search Genome and it finds the forward primer but not the reverse primer. I double checked the reverse primer and it is complementary


---------- 2 of 2 ----------
From: Brian Lee <***@soe.ucsc.edu>
Date: Jul 23 04:20PM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/6b7862101f8abfb0

Dear Jake,

Thank you for using the UCSC Genome Browser and your question about
In-Silico PCR, and the helpful details you provided to assist in a response.

A UCSC browser engineer examined the region and discovered that the
reverse primer falls into an area of repeats. Here is a link to a session
displaying the primers and the RepeatMasker track:
http://genome.ucsc.edu/cgi-bin/hgTracks?hgS_doOtherUser=submit&hgS_otherUserName=brianlee&hgS_otherUserSessionName=Primers%20Blat%20Search

An engineer explained that at 20 bases, your sequences are near the limit
of BLAT's sensitivity. BLAT seeds alignments from tiles, short sections of
the query used to match and trigger an alignment. With such short
sequences, if a tile is over-used, the algorithm will not be able to seed
and extend an original BLAT hit in an area of repeats, which is likely
what is happening with your reverse primer sequence.

You can see in the above session that a larger BLAT query will result in a
hit. Since primers are typically chosen from unique locations, our engineer
suggests it might be best to avoid a repeat-region, and notes that the
nearby chr13:50,593,214-50,593,267 section appears to not have any repeats.

Thank you again for your inquiry and using the UCSC Genome Browser. If you
have further questions, please feel free to contact the mailing list again
at ***@soe.ucsc.edu.

All the best,

Brian Lee
UCSC Genome Bioinformatics Group


On Tue, Jul 23, 2013 at 11:18 AM, Kim, Jake (NIH/NIMH) [F] <***@nih.gov



=============================================================================
Topic: Table Browser RefSeq Genes refGene GTF Output
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/e6412c75de707e1c
=============================================================================

---------- 1 of 1 ----------
From: Jonathan Casper <***@soe.ucsc.edu>
Date: Jul 23 04:17PM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/963aa2b7da32d411

Hello Andy,

We make binaries of the command line utilities available for several
systems, but 32-bit Linux is not one of them. If you go to our downloads
page at http://hgdownload.soe.ucsc.edu and scroll down to the "Source
Downloads" section, you'll find a link to instructions for downloading the
source code for these utilities and building them on your own system.

Unfortunately, I can't really tell where your colleague obtained that GTF
file from. The file listing you provided doesn't look like anything on our
downloads server. In any case, the most up-to-date version of a genes list
in GTF format you're likely to find is the file you just generated with
genePredToGtf.

I hope this is helpful. If you have any further questions, please reply to
***@soe.ucsc.edu.

--
Jonathan Casper
UCSC Genome Bioinformatics Staff





=============================================================================
Topic: retrieving complete set of exons (+ coordinates and reading frame positions)
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/f9bc6c7b94973dd9
=============================================================================

---------- 1 of 1 ----------
From: Max Shpak <***@gmail.com>
Date: Jul 23 05:20PM -0500
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/32c325b6f06d4ca0

Hello,
I have been trying to figure out how to obtain a list of exons given a
set of gene names (and eventually the set of all genes with hgnc referenced
names) from the UCSC genome browser. I can't seem to locate the field for
doing so.
In addition, I need to obtain the coordinates (chromosome number, start
and end for each exon), as well as the reading frame (e.g. if the exon is
ACGACAT..., whether the first translated codon is ACG, CGA, GAC). Is there
some straightforward way to extract this information?
Thank you, Max Shpak
--
=======================
Max Shpak, Ph.D.
NeuroTexas Institute
St. David's Medical Center
1015 East 32nd Street, Suite 404
Austin, TX 78705
(512) 544-8077



=============================================================================
Topic: bigWigAverageOverBed accounts for strand information or not?
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/2c586b0bbb054106
=============================================================================

---------- 1 of 1 ----------
From: "Steve Heitner" <***@soe.ucsc.edu>
Date: Jul 23 03:24PM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/7859f00680a2fd99

Hello, Hu.

The bigWigAverageOverBed utility does not have a strand-specific parameter.
Wiggle values are not strand-specific and only contain one value per base.
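
Because the utility itself ignores strand, one possible workaround is to
split the BED regions by strand before running it. This assumes you have
separate plus- and minus-strand bigWig files, which is not stated in the
question. The sketch below is only an illustration; the function name and
the example regions are hypothetical.

```python
from collections import defaultdict

def split_bed_by_strand(bed_lines):
    """Group BED lines by their strand column (the 6th field).

    Lines with fewer than six fields are filed under "." (no strand).
    """
    by_strand = defaultdict(list)
    for line in bed_lines:
        line = line.rstrip("\n")
        # Skip blank lines and custom-track header/comment lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split("\t")
        strand = fields[5] if len(fields) >= 6 else "."
        by_strand[strand].append(line)
    return dict(by_strand)

# Hypothetical regions, for illustration only.
example = [
    "chr1\t100\t200\tregionA\t0\t+",
    "chr1\t300\t400\tregionB\t0\t-",
    "chr2\t50\t150\tregionC\t0\t+",
]
groups = split_bed_by_strand(example)
for strand, lines in sorted(groups.items()):
    print(strand, len(lines))
```

Each group could then be written to its own BED file and passed to
bigWigAverageOverBed together with the bigWig for the matching strand.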

Please contact us again at ***@soe.ucsc.edu if you have any further
questions.

---
Steve Heitner
UCSC Genome Bioinformatics Group

-----Original Message-----
From: ***@soe.ucsc.edu [mailto:***@soe.ucsc.edu] On Behalf Of hulong
Sent: Sunday, July 21, 2013 6:43 AM
To: ***@soe.ucsc.edu
Subject: [genome] bigWigAverageOverBed accounts for strand information or
not?

Dear All,
I'm Hu Long, a PhD student of Tsinghua University, China.
I have a question: Does bigWigAverageOverBed consider strand
information? In Bedtools, intersectBed accepts a "-s" or "-S" parameter to
specify the strand; can bigWigAverageOverBed do the same thing?
Thanks a lot! Looking forward to your reply!
Best

Hu Long.

--



=============================================================================
Topic: need help
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/198c109b69c500b2
=============================================================================

---------- 1 of 1 ----------
From: Brian Lee <***@soe.ucsc.edu>
Date: Jul 23 03:11PM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/efac58990b4b72bc

Dear Artur,

Thank you for using the UCSC Genome Browser and for your question about
uploading a file generated from your Torrent Server Personal Genome Machine
(PGM) System.

To submit your variant calls to our Variant Annotation Integrator (VAI),
you must provide them in either the Personal Genome SNP (pgSnp) format or
the Variant Call Format (VCF). While pgSnp-formatted variants may be
uploaded as text in a Custom Track, a VCF file must be compressed and
indexed, placed on a web server (HTTP, HTTPS or FTP), and configured as a
Custom Track (or, if you happen to have a Track Hub, as a hub track).

When a VCF file is compressed and indexed using tabix, and made
web-accessible, the Genome Browser can fetch only the portions of the file
necessary to provide annotated items in the VAI. This makes it possible to
access variants from files that are so large that the connection to UCSC
would time out when attempting to upload the whole file to UCSC.

Please see this page about VCF format for details,
http://genome.ucsc.edu/goldenPath/help/vcf.html.

By following the VCF custom track workflow, you can construct a custom
track with a line like:

track type=vcfTabix name="My VCF" bigDataUrl=http://myorg.edu/mylab/my.vcf.gz

Please note that in addition to the VCF file at
http://myorg.edu/mylab/my.vcf.gz, the associated index file,
http://myorg.edu/mylab/my.vcf.gz.tbi, must also be available at the same
(example) location.
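
As a rough illustration of the two pieces described above, the following
sketch builds the custom track line and derives the index URL that has to
sit next to the VCF file. The helper names are hypothetical, and the URL is
the same placeholder used in this message, not a real location.

```python
def vcf_track_line(name, big_data_url):
    """Build a vcfTabix custom track line for a compressed VCF file."""
    if not big_data_url.endswith(".vcf.gz"):
        raise ValueError("expected a compressed file ending in .vcf.gz")
    return f'track type=vcfTabix name="{name}" bigDataUrl={big_data_url}'

def tabix_index_url(big_data_url):
    """The tabix index is expected at the same URL with .tbi appended."""
    return big_data_url + ".tbi"

# Placeholder URL taken from the example in this message.
url = "http://myorg.edu/mylab/my.vcf.gz"
print(vcf_track_line("My VCF", url))
print(tabix_index_url(url))
```

The printed track line is what would be pasted into the custom track form;
both URLs need to be reachable from the web for the track to load.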

Thank you again for your inquiry and for using the UCSC Genome Browser. If you
have further questions, please feel free to contact the mailing list again
at ***@soe.ucsc.edu.

All the best,

Brian Lee
UCSC Genome Bioinformatics Group





=============================================================================
Topic: Blat configuration Question
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/208a986432612aed
=============================================================================

---------- 1 of 1 ----------
From: "Steve Heitner" <***@soe.ucsc.edu>
Date: Jul 23 03:11PM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/21330f14af24f5e6

Hello, Jason.

For clarification, the default blat step size is actually 11. Also, in your
command line, you specify a minScore value of 85. Your 39 bp short read can
only have a maximum score of 39, so you are automatically excluding any
possible results for this query.
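
The arithmetic behind this can be sketched as a quick pre-flight check. It
assumes, as described above, that a query's score can never exceed its
length (every base matching, no mismatches or gaps). The function names are
hypothetical and are not part of blat.

```python
def max_possible_score(query_seq):
    """Upper bound on the score: every base matches, with no penalties."""
    return len(query_seq)

def can_pass_min_score(query_seq, min_score):
    """True if a perfect hit for this query could reach min_score."""
    return max_possible_score(query_seq) >= min_score

# The short query from the original question.
short_query = "atgatgaagcgagcaagaattatttataatcctacttct"

print(max_possible_score(short_query))      # length of the short query
print(can_pass_min_score(short_query, 85))  # minScore used in the command
```

Running a check like this over every query before calling blat would flag
sequences that a given -minScore setting rules out in advance.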

Please contact us again at ***@soe.ucsc.edu if you have any further
questions.

---
Steve Heitner
UCSC Genome Bioinformatics Group



From: ***@soe.ucsc.edu [mailto:***@soe.ucsc.edu] On Behalf Of Jason
Baumohl
Sent: Monday, July 22, 2013 4:11 PM
To: ***@soe.ucsc.edu
Subject: [genome] Blat configuration Question



Hello,

I am playing around with Blat and making sure I understand the results and
getting the results I expect.

My hope is to use Blat for mapping microarray probes to a genome.

However, when I create a shorter read case (39 bp), I do not get results.

I installed using this command:
wget http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/blat/blat

The program gives me all the expected results except the shorter sequence
result.
kb|g.1079.peg.2033_multiple_long
atgatgaagcgagcaagaattatttataatcctacttctgggcgtgagctatttaagaagagcttaccggaagtat
tacaaaaattagaacaagctggatatgaagcatcttgtcatgcgacaacgggtcctggagacgcaactgtggcggc
gaggcaagctgtaga
kb|multiple_short
atgatgaagcgagcaagaattatttataatcctacttct
kb|g.1079.peg.4927
gaattccgcatcggtacagcaattaaagaagcaactgaagaaggtattatcgttgcgaacggtgatgatactgaat
taattaagtctgaaacagtagtttgggctgcaggtgttcgtggtaacggtattgtggaagagtctggctttgaagc
aatgcgcggacgtgt
kb|g.1079.peg.4927_last_base
gaattccgcatcggtacagcaattaaagaagcaactgaagaaggtattatcgttgcgaacggtgatgatactgaat
taattaagtctgaaacagtagtttgggctgcaggtgttcgtggtaacggtattgtggaagagtctggctttgaagc
aatgcgcggacgtga
kb|g.1079.peg.4927_first_base
aaattccgcatcggtacagcaattaaagaagcaactgaagaaggtattatcgttgcgaacggtgatgatactgaat
taattaagtctgaaacagtagtttgggctgcaggtgttcgtggtaacggtattgtggaagagtctggctttgaagc
aatgcgcggacgtgt
kb|g.1079.peg.4927_middle_base_grun
gaattccgcatcggtacagcaattaaagaagcaactgaagaaggtattatcgttgcgaacggtgatgatactgaat
taattaagtctgaaacagtagtttcggctgcaggtgttcgtggtaacggtattgtggaagagtctggctttgaagc
aatgcgcggacgtgt
kb|g.1079.peg.4927_deletion_grun
gaattccgcatcggtacagcaattaaagaagcaactgaagaaggtattatcgttgcgaacggtgatgatactgaat
taattaagtctgaaacagtagtttggctgcaggtgttcgtggtaacggtattgtggaagagtctggctttgaagca
atgcgcggacgtgt
kb|g.1079.peg.4927_insertion_grun
gaattccgcatcggtacagcaattaaagaagcaactgaagaaggtattatcgttgcgaacggtgatgatactgaat
taattaagtctgaaacagtagtttggggctgcaggtgttcgtggtaacggtattgtggaagagtctggctttgaag
caatgcgcggacgtgt



I purposely copied and pasted the kb|g.1079.peg.2033_multiple_long sequence
into the database file (my contigs file)

The kb|multiple_short sequence is a subset of the
kb|g.1079.peg.2033_multiple_long sequence, so I would expect the short one
to be found as a result.



I am running the following command:

/usr/local/bin/blat -t=dna -q=dna -tileSize=6 -minIdentity=85 -minMatch=0
-out=psl -maxIntron=150 -minScore=85 kb_g.1079_genome_multiple
test_query_multiple results_95multiple.psl

Here is the output.
psLayout version 3

match  mis-   rep.   N's  Q gap  Q gap  T gap  T gap  strand  Q     Q     Q      Q    T     T     T      T    block  blockSizes  qStarts  tStarts
       match  match       count  bases  count  bases          name  size  start  end  name  size  start  end  count
---------------------------------------------------------------------------------------------------------------------------------------------------------------
167 0 0 0 0 0 0 0 + kb|g.1079.peg.2033_multiple_long 167 0 167 kb|g.1079.c.2 52933 256 423 1 167, 0, 256,
167 0 0 0 0 0 0 0 + kb|g.1079.peg.2033_multiple_long 167 0 167 kb|g.1079.c.1 239580 193280 193447 1 167, 0, 193280,
167 0 0 0 0 0 0 0 + kb|g.1079.peg.2033_multiple_long 167 0 167 kb|g.1079.c.1 239580 0 167 1 167, 0, 0,
167 0 0 0 0 0 0 0 + kb|g.1079.peg.2033_multiple_long 167 0 167 kb|g.1079.c.0 5214571 350999 351166 1 167, 0, 350999,
167 0 0 0 0 0 0 0 + kb|g.1079.peg.2033_multiple_long 167 0 167 kb|g.1079.c.0 5214571 340689 340856 1 167, 0, 340689,
167 0 0 0 0 0 0 0 + kb|g.1079.peg.2033_multiple_long 167 0 167 kb|g.1079.c.0 5214571 256 423 1 167, 0, 256,
167 0 0 0 0 0 0 0 - kb|g.1079.peg.4927 167 0 167 kb|g.1079.c.0 5214571 4625967 4626134 1 167, 0, 4625967,
166 0 0 0 0 0 0 0 - kb|g.1079.peg.4927_last_base 167 0 166 kb|g.1079.c.0 5214571 4625968 4626134 1 166, 1, 4625968,
166 0 0 0 0 0 0 0 - kb|g.1079.peg.4927_first_base 167 1 167 kb|g.1079.c.0 5214571 4625967 4626133 1 166, 0, 4625967,
166 1 0 0 0 0 0 0 - kb|g.1079.peg.4927_middle_base_grun 167 0 167 kb|g.1079.c.0 5214571 4625967 4626134 1 167, 0, 4625967,
166 0 0 0 0 0 1 1 - kb|g.1079.peg.4927_deletion_grun 166 0 166 kb|g.1079.c.0 5214571 4625967 4626134 2 66,100, 0,66, 4625967,4626034,
167 0 0 0 1 1 0 0 - kb|g.1079.peg.4927_insertion_grun 168 0 168 kb|g.1079.c.0 5214571 4625967 4626134 2 67,100, 0,68, 4625967,4626034,



Can anyone explain why the kb|multiple_short sequence is not being found?

Do I need to alter my command somehow?

It seems like the default blat step size is 5, so it appears by the
documentation it should have no problem with this seq length.

Thanks,

Jason

--



=============================================================================
Topic: Non-Human RefSeq Genes
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/53552d8c248613a
=============================================================================

---------- 1 of 1 ----------
From: "Steve Heitner" <***@soe.ucsc.edu>
Date: Jul 23 02:02PM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/b92a0360f6640944

Hello, Brian.

I'm not sure if you are trying to display your own data in this track or if
you would just like to see a species that isn't currently being displayed.

There are a number of species in the Browser that contain RefSeq data. If
an assembly has an Other RefSeq track, the contents from all other species'
RefSeq Genes tracks are included. For example, in the human assembly, all
Browser species other than human with RefSeq Genes tracks are displayed in
the human Other RefSeq track. That being said, all available Browser
species with a RefSeq Genes track are already displayed in the Other RefSeq
track and no additional species can be added. If a species is not included
in the Other RefSeq track, it is either hosted by UCSC and does not have a
RefSeq Genes track or it is not hosted by UCSC.

If you have your own data that you are trying to display, you have two
possible options:

1. If you have a specific sequence that you would like to align with the
human genome, you can perform a blat search
(http://genome.ucsc.edu/cgi-bin/hgBlat) and then click one of the "browser"
links on the results page to visually align your sequence with the human
sequence. Your blat results will be listed in the Browser as a separate
"track" with the title "Your Sequence from Blat Search".

2. If you have coordinate data that you would like to view alongside the
Other RefSeq track, you can load your coordinates as a custom track or even
a track hub. If you have never created a custom track, please see the
custom track help doc at
http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#CustomTracks. The
track hub documentation is located at
http://genome.ucsc.edu/goldenPath/help/hgTrackHubHelp.html.
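
For the second option, a minimal sketch of what custom track text could
look like is shown below. It assumes your coordinates can be expressed as
simple BED intervals (0-based start, end not included). The function name,
track name and regions are hypothetical and are for illustration only.

```python
def bed_custom_track(name, description, regions):
    """Return custom track text: one track line followed by BED4 rows."""
    header = f'track name="{name}" description="{description}"'
    rows = []
    for chrom, start, end, label in regions:
        # BED starts are 0-based and the end coordinate is exclusive.
        if not 0 <= start < end:
            raise ValueError(f"bad interval for {label}: {start}-{end}")
        rows.append(f"{chrom}\t{start}\t{end}\t{label}")
    return "\n".join([header] + rows) + "\n"

# Hypothetical regions, for illustration only.
my_regions = [
    ("chr1", 1000, 2000, "featureA"),
    ("chr2", 5000, 5500, "featureB"),
]
track_text = bed_custom_track("myData", "My coordinates", my_regions)
print(track_text, end="")
```

The resulting text can be pasted into the custom track form or saved to a
file and uploaded, as described in the custom track help doc.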

Please contact us again at ***@soe.ucsc.edu if you have any further
questions.

---
Steve Heitner
UCSC Genome Bioinformatics Group



From: ***@soe.ucsc.edu [mailto:***@soe.ucsc.edu] On Behalf Of Brian
Smith
Sent: Monday, July 22, 2013 7:51 AM
To: ***@soe.ucsc.edu
Subject: [genome] Non-Human RefSeq Genes



How can I get the UCSC genome browser to display species that are not yet
displayed on the track?

thanks!

--



=============================================================================
Topic: Will the DGV Struct Var track be updated?
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/896b24a4368f4d48
=============================================================================

---------- 1 of 1 ----------
From: Brian Lee <***@soe.ucsc.edu>
Date: Jul 23 12:43PM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/44c8f55b33fdbf02

Dear Kie,

Thank you for your kind message and for alerting us to the broken links in
the current DGV track. We do plan to update this track in the next couple
of weeks, which will correct these currently displayed broken links.

Our engineers contacted DGV today and learned that, since DGV replaced
"dgvbeta" with "dgv", our links will not work for a select number of DGV
identifiers, namely those starting with dgv_*. The majority of items,
however, should work, such as the nsv* and esv* links. These broken links
should be a temporary issue affecting a select number of items until we
release the updated track in the coming weeks.

Thank you again for using the UCSC Genome Browser and helping us improve it.
If you have further questions, please feel free to contact the mailing list
again at ***@soe.ucsc.edu.

All the best,

Brian Lee
UCSC Genome Bioinformatics Group





=============================================================================
Topic: Sessions using human genomes not working
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/187dc0439120e001
=============================================================================

---------- 1 of 1 ----------
From: "Steve Heitner" <***@soe.ucsc.edu>
Date: Jul 23 11:52AM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/f8b8a1e12d6be30b

Hello, David.

I assume that you have saved your sessions into files that you are now
trying to load. If this is the case, can you send us the files so we can
also attempt to load them? You can send them directly to me if you don't
want to send them to the entire list. Also, please let us know your
username so we can look into possible problems with your account.

We don't have any similar reports of problems loading existing sessions and
if you can load some existing sessions from files and not others, it may be
possible that the files have somehow become corrupt. In any event, we will
try to load them here and go from there.

Please contact us again at ***@soe.ucsc.edu if you have any further
questions.

---
Steve Heitner
UCSC Genome Bioinformatics Group



From: ***@soe.ucsc.edu [mailto:***@soe.ucsc.edu] On Behalf Of Price,
David H
Sent: Monday, July 22, 2013 8:09 AM
To: ***@soe.ucsc.edu
Subject: [genome] Sessions using human genomes not working



All sessions I have saved that access the hg18 or hg19 have stopped working.
I can hear the initial read (hard drive access) of the data off my server,
but it is truncated and then the session times out after about a minute.

I can create new sessions that work and access the data from my server and I
can access old sessions that use mouse (mm8 or 9), drosophila or xenopus
genomes.

Is something broken at the UCSC end?

David



David H. Price

Professor of Biochemistry

University of Iowa

375 Newton Rd.

Iowa City, IA 52242



(319) 335-7910

http://www.uiowa.edu/~pricelab/



--



=============================================================================
Topic: New public hub to add
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/t/88a21325863e98dc
=============================================================================

---------- 1 of 1 ----------
From: Jonathan Casper <***@soe.ucsc.edu>
Date: Jul 23 11:45AM -0700
Url: http://groups.google.com/a/soe.ucsc.edu/group/genome/msg/af966e19eeea9ad7

Hello David,

Beautiful work. Your hub has been added to the public track hub list.

Please let us know if you have any further questions.

--
Jonathan Casper
UCSC Genome Bioinformatics Staff





