
Structure resolution

Oct 17, 2012 at 9:51 AM

From what I've seen in the API, there's currently no way to get a PDB structure's resolution. Is that correct?

Thanks in advance.

Coordinator
Oct 17, 2012 at 10:02 AM

That's right, there is currently no resolution API and such information is ignored by our PDB parsers as well. But implementing this feature would be really straightforward. You may consider submitting a feature request in our issue tracker.

Oct 17, 2012 at 10:08 AM

Done!

Thanks.

Coordinator
Oct 21, 2012 at 7:46 AM

Your feature request is now implemented and will be part of a future release. If you want to get it now, just clone the repository and run the csb/build.py program. This will create a standard release package which you can install.

Oct 21, 2012 at 9:33 AM

Excellent. Thanks!

Oct 24, 2012 at 2:00 PM
Edited Oct 24, 2012 at 2:02 PM

I've been testing the resolution property, and it works just fine in the normal use case, when I do this:

from csb.bio.io.wwpdb import StructureParser

par = StructureParser("pdb2trx.ent")
struct = par.parse_structure()
print(struct.accession)
print(struct.resolution)

I have a local mirror of the PDB repository in which all the files are gzipped, so I was using a nasty hack (I'm aware that I'm accessing what should be a private attribute):

import gzip

f = gzip.open("pdb2trx.ent.gz", 'rb')
par = StructureParser("/dev/null")
par._stream = f   # hack: swap in the gzip stream behind the parser's back
struct = par.parse_structure()
print(struct.accession)
print(struct.resolution)
f.close()

In this second example, the accession displays ok, but the resolution returns None. I know it's not the way it's supposed to be used, but it would be a pain to gunzip all the files before processing. Is there any workaround you can think of to fix this?

Thanks a lot!

Coordinator
Oct 24, 2012 at 2:26 PM

Are your PDB files missing a SEQRES field? Only RegularStructureParser reads the resolution; LegacyStructureParser just assumes there is no header and completely ignores anything not in (HEADER, ATOM). This could be changed with some refactoring, of course; the question is rather whether it makes sense.
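
To check a file by hand for the two things mentioned above (SEQRES records and a resolution entry), a stdlib-only sketch along these lines may help. It is not CSB code: `has_seqres`, `read_resolution` and `inspect_pdb` are hypothetical helper names, the sketch assumes Python 3 (text-mode `gzip.open`), and it relies on the PDB convention of reporting the resolution in a `REMARK   2 RESOLUTION.` record.

```python
import gzip


def has_seqres(lines):
    """Return True if any line is a SEQRES record."""
    return any(line.startswith("SEQRES") for line in lines)


def read_resolution(lines):
    """Return the resolution in Angstroms from REMARK 2, or None if absent."""
    for line in lines:
        if line.startswith("REMARK   2 RESOLUTION."):
            fields = line.split()
            # Expected fields: REMARK, 2, RESOLUTION., <value>, ANGSTROMS.
            try:
                return float(fields[3])
            except (IndexError, ValueError):
                return None   # e.g. "NOT APPLICABLE" instead of a number
    return None


def inspect_pdb(path):
    """Report (has_seqres, resolution) for a plain or gzipped PDB file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as stream:   # text mode works for both openers
        lines = stream.readlines()
    return has_seqres(lines), read_resolution(lines)
```

Running `inspect_pdb` on both the gzipped and the uncompressed copy of the same entry should give identical results; if it does, the difference seen with the hack is more likely to come from how the parser was constructed than from the file contents.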

As for reading gzipped files, there are two ways to do it:

1) Create your own parser:

import gzip
from csb.bio.io.wwpdb import RegularStructureParser

class GzipRegularStructureParser(RegularStructureParser):

    @property
    def filename(self):
        return self._file

    @filename.setter
    def filename(self, name):
        try:
            stream = gzip.open(name)   # that's really all you have to change
        except IOError:
            raise IOError('File not found: {0}'.format(name))

        if self._stream:
            try:
                self._stream.close()
            except Exception:   # ignore errors while closing the previous stream
                pass
        self._stream = stream
        self._file = name

2) Create a new FileSystemStructureProvider and use it to parse your structures (recommended). Something like:

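import gzip
import csb.io
from csb.bio.io.wwpdb import FileSystemStructureProvider, StructureNotFoundError, StructureParser

class GzipStructureProvider(FileSystemStructureProvider):

    def get(self, id, model=None):

        pdb = self.find(id)

        if pdb is None:
            raise StructureNotFoundError(id)
        else:
            with csb.io.TempFile() as temp:
                archive = gzip.open(pdb)
                try:
                    text = archive.read()
                finally:
                    archive.close()   # always release the gzip handle
                temp.write(text)
                temp.flush()
                # parse the uncompressed temporary copy like any regular PDB file
                return StructureParser(temp.name).parse_structure(model=model)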
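
The same decompress-to-a-temporary-file idea can also be sketched with the standard library alone, for cases where subclassing the provider is not needed. This is only an illustration under assumptions: `gunzipped` is a hypothetical helper name, not part of CSB, and the `.pdb` suffix is an arbitrary choice.

```python
import contextlib
import gzip
import os
import shutil
import tempfile


@contextlib.contextmanager
def gunzipped(path, suffix=".pdb"):
    """Yield the name of a temporary, uncompressed copy of a gzipped file."""
    fd, temp_name = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as out, gzip.open(path, "rb") as src:
            shutil.copyfileobj(src, out)   # copy in chunks instead of one big read
        yield temp_name
    finally:
        os.remove(temp_name)               # clean up even if parsing fails


# Possible usage with the parser from this thread (requires CSB):
#
# with gunzipped("pdb2trx.ent.gz") as name:
#     struct = StructureParser(name).parse_structure()
```

Because the parser receives a real file name, it can inspect the file as usual, which avoids the `/dev/null` hack.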

 

Oct 24, 2012 at 2:34 PM

Thanks for the quick reply. I'll try to build my own FileSystemStructureProvider then.

As for structures missing the SEQRES field, please note that I'm parsing the same file in both cases; the only difference is that one is the gzipped version and the other is the uncompressed PDB file. That's why it struck me as odd to get the resolution from the uncompressed file but not from the gzipped one.

Thanks again!

Coordinator
Oct 24, 2012 at 2:41 PM

Strange. I'll have to debug it. Would it be possible to share one of those files with us?

Oct 24, 2012 at 2:52 PM

Certainly!

I'll create an issue and attach the files, in the hope that it helps you track down some strange bug. The solution you provided above does seem like the right one, of course ;)