models
SQLAlchemy Object Relational Mapper (ORM) declarations, implemented as a set of classes.
All class attributes are Column objects representing columns of a SQL database table. Data types are detailed in the Attributes section of each class.
base_entity
Base SQLAlchemy ORM entities.
class soweego.importer.models.base_entity.BaseEntity(**kwargs)
Minimal ORM structure for a target catalog entry. Each ORM entity should inherit this class.
Attributes:
internal_id (integer) - an internal primary key
catalog_id (string(50)) - a target catalog identifier
name (text) - a full name (person), or full title (work)
name_tokens (text) - a name tokenized through tokenize()
born (date) - a birth (person), or publication (work) date
born_precision (integer) - a birth (person), or publication (work) date precision
died (date) - a death date. Only applies to a person
died_precision (integer) - a death date precision. Only applies to a person
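As an illustration of the attribute list above, the following is a minimal sketch of how a base class with these columns could be declared and inherited in SQLAlchemy. It is not soweego's actual source: the names SketchBaseEntity and ExampleMusicianEntity, the example_musician table, the use of `__abstract__`, and the in-memory SQLite database are all assumptions made for the example.

```python
from datetime import date

from sqlalchemy import Column, Date, Integer, String, Text, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class SketchBaseEntity(Base):
    """Hypothetical stand-in for BaseEntity (not soweego's actual code)."""

    # Assumption: an abstract base, so only subclasses get a table.
    __abstract__ = True

    internal_id = Column(Integer, primary_key=True, autoincrement=True)
    catalog_id = Column(String(50))
    name = Column(Text)
    name_tokens = Column(Text)
    born = Column(Date)
    born_precision = Column(Integer)
    died = Column(Date)
    died_precision = Column(Integer)


class ExampleMusicianEntity(SketchBaseEntity):
    """Hypothetical concrete entity: inherits every column of the base."""

    __tablename__ = "example_musician"


# An in-memory SQLite database keeps the sketch self-contained.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# The default declarative constructor accepts column values as keyword
# arguments, which matches the (**kwargs) signature documented above.
session.add(
    ExampleMusicianEntity(catalog_id="123", name="Jane Doe", born=date(1970, 1, 1))
)
session.commit()

found = session.query(ExampleMusicianEntity).filter_by(catalog_id="123").one()
print(found.internal_id, found.name, found.born)
```

The subclass only sets a table name, so any catalog-specific columns would be added next to it while the shared ones stay in the base.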
class soweego.importer.models.base_entity.BaseRelationship(from_catalog_id, to_catalog_id)
Minimal ORM structure for a target catalog relationship between entries. Each ORM relationship entity should implement this interface.
You can build a relationship for different purposes: typically, to connect works with people, or groups with individuals.
Attributes:
from_catalog_id (string(50)) - a target catalog identifier
to_catalog_id (string(50)) - a target catalog identifier
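A relationship row can be read as a pair of catalog identifiers that is resolved with joins. The sketch below assumes hypothetical tables (example_work, example_person, example_work_person), a surrogate primary key on the relationship table, and a work-to-person direction for the pair; none of these details come from soweego itself.

```python
from sqlalchemy import Column, Integer, String, Text, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class ExampleWorkEntity(Base):
    """Hypothetical work table, reduced to the columns needed here."""

    __tablename__ = "example_work"
    internal_id = Column(Integer, primary_key=True)
    catalog_id = Column(String(50))
    name = Column(Text)


class ExamplePersonEntity(Base):
    """Hypothetical person table, reduced to the columns needed here."""

    __tablename__ = "example_person"
    internal_id = Column(Integer, primary_key=True)
    catalog_id = Column(String(50))
    name = Column(Text)


class ExampleWorkPersonRelationship(Base):
    """Hypothetical relationship: from a work identifier to a person identifier."""

    __tablename__ = "example_work_person"
    # Assumption: a surrogate key, since the ORM needs some primary key.
    internal_id = Column(Integer, primary_key=True)
    from_catalog_id = Column(String(50))
    to_catalog_id = Column(String(50))

    def __init__(self, from_catalog_id, to_catalog_id):
        self.from_catalog_id = from_catalog_id
        self.to_catalog_id = to_catalog_id


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add_all(
    [
        ExampleWorkEntity(catalog_id="w1", name="Example Album"),
        ExamplePersonEntity(catalog_id="p1", name="Jane Doe"),
        ExamplePersonEntity(catalog_id="p2", name="John Roe"),
        ExampleWorkPersonRelationship("w1", "p1"),
        ExampleWorkPersonRelationship("w1", "p2"),
    ]
)
session.commit()

# Resolve both ends of each relationship row through catalog identifiers.
rows = (
    session.query(
        ExampleWorkEntity.name.label("work"),
        ExamplePersonEntity.name.label("person"),
    )
    .select_from(ExampleWorkPersonRelationship)
    .join(
        ExampleWorkEntity,
        ExampleWorkEntity.catalog_id == ExampleWorkPersonRelationship.from_catalog_id,
    )
    .join(
        ExamplePersonEntity,
        ExamplePersonEntity.catalog_id == ExampleWorkPersonRelationship.to_catalog_id,
    )
    .order_by(ExamplePersonEntity.name)
    .all()
)
pairs = [(row.work, row.person) for row in rows]
print(pairs)
```

The same shape would apply to a group-to-individual membership by pointing both joins at the relevant entity tables.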
base_link_entity
Base SQLAlchemy ORM entity for URLs.
class soweego.importer.models.base_link_entity.BaseLinkEntity(**kwargs)
Minimal ORM structure for a target catalog link/URL. Each ORM link entity should inherit this class.
Attributes:
internal_id (integer) - an internal primary key
catalog_id (string(50)) - a target catalog identifier
url (text) - a full URL
is_wiki (boolean) - whether a URL is a Wiki link or not
url_tokens (text) - a URL tokenized through tokenize()
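To show how the link attributes fit together, here is a small sketch that stores two URLs and filters on the is_wiki flag. ExampleLinkEntity and its table name are hypothetical, the is_wiki values are set by hand, and sketch_tokenize is only a naive stand-in: it does not reproduce the behavior of soweego's tokenize().

```python
import re

from sqlalchemy import Boolean, Column, Integer, String, Text, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class ExampleLinkEntity(Base):
    """Hypothetical link table with the attributes listed above."""

    __tablename__ = "example_link"
    internal_id = Column(Integer, primary_key=True, autoincrement=True)
    catalog_id = Column(String(50))
    url = Column(Text)
    is_wiki = Column(Boolean)
    url_tokens = Column(Text)


def sketch_tokenize(text):
    """Naive stand-in tokenizer: lowercase, then split on non-word characters.

    Assumption for illustration only; NOT soweego's tokenize().
    """
    return " ".join(token for token in re.split(r"\W+", text.lower()) if token)


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Two made-up links for the same catalog entry; is_wiki is assigned manually.
links = [
    ("https://example.org/artist/jane-doe", False),
    ("https://en.wikipedia.org/wiki/Example", True),
]
for url, is_wiki in links:
    session.add(
        ExampleLinkEntity(
            catalog_id="123",
            url=url,
            is_wiki=is_wiki,
            url_tokens=sketch_tokenize(url),
        )
    )
session.commit()

# Keep only the links that are not flagged as Wiki links.
external = session.query(ExampleLinkEntity).filter_by(is_wiki=False).all()
print([(link.url, link.url_tokens) for link in external])
```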
base_nlp_entity
Base SQLAlchemy ORM entity for textual data that will undergo some natural language processing (NLP).
class soweego.importer.models.base_nlp_entity.BaseNlpEntity(**kwargs)
Minimal ORM structure for a target catalog piece of text. Each ORM NLP entity should inherit this class.
Attributes:
internal_id (integer) - an internal primary key
catalog_id (string(50)) - a target catalog identifier
description (text) - a text describing the main catalog entry
description_tokens (text) - a description tokenized through tokenize()
discogs_entity
Discogs SQLAlchemy ORM entities.
class soweego.importer.models.discogs_entity.DiscogsArtistEntity(**kwargs)
A Discogs artist: either a musician or a band. It comes from the _artists.xml.gz dataset. See the download page.
All ORM entities describing Discogs people should inherit this class.
Attributes:
real_name (text) - a name in real life
data_quality (string(20)) - an indicator of data quality
class soweego.importer.models.discogs_entity.DiscogsGroupEntity(**kwargs)
A Discogs group, namely a band.
class soweego.importer.models.discogs_entity.DiscogsGroupLinkEntity(**kwargs)
A Discogs band Web link (URL).
class soweego.importer.models.discogs_entity.DiscogsGroupNlpEntity(**kwargs)
A Discogs band textual description.
class soweego.importer.models.discogs_entity.DiscogsMasterArtistRelationship(from_catalog_id, to_catalog_id)
A relationship between a Discogs musical work and the Discogs musician or band who made it.
class soweego.importer.models.discogs_entity.DiscogsMasterEntity(**kwargs)
A Discogs master: a musical work, which can have multiple releases. It comes from the _masters.xml.gz dataset. See the download page.
Attributes:
main_release_id (string(50)) - a Discogs identifier of the main release for this musical work
genres (text) - a string list of musical genres
class soweego.importer.models.discogs_entity.DiscogsMusicianEntity(**kwargs)
A Discogs musician.
imdb_entity
IMDb SQLAlchemy ORM entities, based on the dataset specifications.
class soweego.importer.models.imdb_entity.IMDbNameEntity(**kwargs)
An IMDb name: a person like an actor, director, producer, etc. It comes from the name.basics.tsv.gz dataset. See the download page.
All ORM entities describing IMDb people should inherit this class.
Attributes:
gender (string(10)) - a gender
occupations (string(255)) - a string list of Wikidata occupation QIDs
class soweego.importer.models.imdb_entity.IMDbTitleEntity(**kwargs)
An IMDb title: an audiovisual work like a movie, short, TV series episode, etc. It comes from the title.basics.tsv.gz dataset. See the download page.
All ORM entities describing IMDb works should inherit this class.
Attributes:
title_type (string(100)) - an audiovisual work type, like movie or short
primary_title (text) - the most popular title
original_title (text) - a title in the original language
is_adult (boolean) - whether the audiovisual work is for adults or not
runtime_minutes (integer) - a runtime in minutes
genres (string(255)) - a string list of audiovisual genres
musicbrainz_entity
MusicBrainz SQLAlchemy ORM entities, based on the database specifications.
class soweego.importer.models.musicbrainz_entity.MusicBrainzArtistBandRelationship(from_catalog_id, to_catalog_id)
A membership between a MusicBrainz artist and a MusicBrainz band.
class soweego.importer.models.musicbrainz_entity.MusicBrainzArtistEntity(**kwargs)
A MusicBrainz artist, namely a musician.
Attributes:
gender (string(10)) - a gender
birth_place (string(255)) - a birth place
death_place (string(255)) - a death place
class soweego.importer.models.musicbrainz_entity.MusicBrainzArtistLinkEntity(**kwargs)
A MusicBrainz musician Web link (URL).
class soweego.importer.models.musicbrainz_entity.MusicBrainzBandEntity(**kwargs)
A MusicBrainz band.
Attributes:
birth_place (string(255)) - a place where the band was formed
death_place (string(255)) - a place where the band was disbanded
class soweego.importer.models.musicbrainz_entity.MusicBrainzBandLinkEntity(**kwargs)
A MusicBrainz band Web link (URL).
class soweego.importer.models.musicbrainz_entity.MusicBrainzReleaseGroupArtistRelationship(from_catalog_id, to_catalog_id)
A relationship between a MusicBrainz musical work and the MusicBrainz musician or band who made it.
mix_n_match
Mix’n’match SQLAlchemy ORM entities for catalogs that need curation.
They follow the catalog and entry tables of the s51434__mixnmatch_p database located in ToolsDB under the Wikimedia Toolforge infrastructure. See how to connect.