Evaluations¶
Setting¶
run: April 11 2019 on the soweego-1 VPS instance;
output folder: /srv/dev/20190411;
head commit: 1505429997b878568a9e24185dc3afa7ad4720eb;
command: python -m soweego linker evaluate ${Algorithm} ${Dataset} ${Entity};
evaluation technique: stratified 5-fold cross validation over training/test splits;
mean performance scores over the folds.
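The evaluation technique can be illustrated with a minimal, standard-library-only sketch. It is not the soweego implementation: the names `stratified_folds`, `cross_validate`, and `threshold_classifier`, as well as the toy data, are hypothetical, and the use of the population standard deviation is an assumption.

```python
import random
import statistics
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds, keeping the class ratio roughly equal in each."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the indices of each class round-robin across the folds
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F-score for binary labels (1 = match)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def cross_validate(samples, labels, train_and_predict, k=5):
    """Return a (mean, std) pair for precision, recall, and F-score over k folds."""
    scores = []
    for test_indices in stratified_folds(labels, k):
        held_out = set(test_indices)
        train_indices = [i for i in range(len(labels)) if i not in held_out]
        predictions = train_and_predict(
            [samples[i] for i in train_indices],
            [labels[i] for i in train_indices],
            [samples[i] for i in test_indices],
        )
        truth = [labels[i] for i in test_indices]
        scores.append(precision_recall_f1(truth, predictions))
    # Transpose per-fold triples into one column per measure
    return [(statistics.mean(col), statistics.pstdev(col)) for col in zip(*scores)]


def threshold_classifier(train_x, train_y, test_x):
    """Toy stand-in for a real model: predict a match when the feature exceeds 0.5."""
    return [1 if x > 0.5 else 0 for x in test_x]


# Made-up data: one similarity feature per sample, 1 = match, 0 = non-match
samples = [0.9, 0.8, 0.7, 0.6, 0.95, 0.1, 0.2, 0.3, 0.4, 0.05]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
results = cross_validate(samples, labels, threshold_classifier, k=5)
```

Each entry of `results` pairs a mean with a standard deviation, which corresponds to the "score (std)" layout of the performance table.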
Algorithms parameters¶
Naïve Bayes (NB):
binarize = 0.1;
alpha = 0.0001;
liblinear SVM (LSVM):
default parameters as per scikit-learn LinearSVC;
libsvm SVM (SVM):
kernel = linear;
other parameters as per scikit-learn SVC defaults;
single-layer perceptron (SLP):
layer = fully connected (Dense);
activation = sigmoid;
optimizer = stochastic gradient descent;
loss = binary cross-entropy;
training batch size = 1,024;
training epochs = 100;
multi-layer perceptron (MLP):
layers = 128 > BN > 32 > BN > 1, i.e., fully connected layers, each hidden one followed by BatchNormalization (BN);
activation: hidden layers = relu, output layer = sigmoid;
optimizer = Adadelta;
loss = binary cross-entropy;
training batch size = 1,024;
training epochs = 1000;
early stopping: patience = 100.
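As a rough illustration, the parameters of the first three algorithms could be expressed with scikit-learn estimators as follows. This is a sketch under assumptions: `BernoulliNB` is used as a stand-in for NB because it is the scikit-learn naïve Bayes variant that accepts `binarize`, and the `classifiers` dictionary is a hypothetical name. The code that soweego actually uses to build its classifiers may differ, and the perceptrons are not covered here.

```python
from sklearn.naive_bayes import BernoulliNB
from sklearn.svm import SVC, LinearSVC

# Hypothetical mapping from the abbreviations used in this page to estimators
classifiers = {
    # Assumption: Bernoulli naive Bayes, the variant exposing `binarize`
    "NB": BernoulliNB(alpha=0.0001, binarize=0.1),
    # liblinear-backed linear SVM with default parameters
    "LSVM": LinearSVC(),
    # libsvm-backed SVM with a linear kernel, other parameters left at defaults
    "SVM": SVC(kernel="linear"),
}

for name, estimator in classifiers.items():
    print(name, type(estimator).__name__)
```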
Performance¶
Algorithm | Dataset | Entity | Precision (std) | Recall (std) | F-score (std) |
---|---|---|---|---|---|
NB | Discogs | Band | .789 (.0031) | .941 (.0004) | .859 (.002) |
LSVM | Discogs | Band | .785 (.0058) | .946 (.0029) | .858 (.0034) |
SVM | Discogs | Band | .777 (.003) | .963 (.0016) | .86 (.0024) |
SLP | Discogs | Band | .776 (.0041) | .956 (.0012) | .857 (.0029) |
NB | Discogs | Musician | .836 (.0018) | .958 (.0012) | .893 (.0013) |
SVM | Discogs | Musician | .814 (.0015) | .986 (.0003) | .892 (.001) |
SLP | Discogs | Musician | .815 (.002) | .985 (.0006) | .892 (.0012) |
NB | IMDb | Actor | TODO | TODO | TODO |
SVM | IMDb | Actor | TODO | TODO | TODO |
SLP | IMDb | Actor | TODO | TODO | TODO |
MLP | IMDb | Actor | TODO | TODO | TODO |
NB | IMDb | Director | .897 (.00195) | .971 (.0012) | .932 (.001) |
SVM | IMDb | Director | .919 (.0031) | .942 (.0019) | .93 (.002) |
SLP | IMDb | Director | .867 (.0115) | .953 (.0043) | .908 (.0056) |
NB | IMDb | Musician | .891 (.0042) | .96 (.0022) | .924 (.0026) |
SVM | IMDb | Musician | .917 (.0043) | .937 (.0034) | .927 (.003) |
SLP | IMDb | Musician | .922 (.005) | .914 (.0092) | .918 (.0055) |
NB | IMDb | Producer | .871 (.0023) | .97 (.0037) | .918 (.0011) |
SVM | IMDb | Producer | .92 (.005) | .938 (.0038) | .929 (.0026) |
SLP | IMDb | Producer | .862 (.0609) | .914 (.0648) | .883 (.0185) |
NB | IMDb | Writer | .91 (.003) | .961 (.0022) | .935 (.0022) |
SVM | IMDb | Writer | .936 (.0029) | .948 (.0025) | .942 (.0026) |
SLP | IMDb | Writer | .903 (.0154) | .955 (.0147) | .928 (.0047) |
NB | MusicBrainz | Band | .822 (.00169) | .985 (.0008) | .896 (.001) |
SVM | MusicBrainz | Band | .943 (.0019) | .888 (.0027) | .914 (.0016) |
SLP | MusicBrainz | Band | .93 (.0265) | .885 (.0103) | .907 (.0082) |
NB | MusicBrainz | Musician | .955 (.0009) | .936 (.0011) | .946 (.00068) |
SVM | MusicBrainz | Musician | .941 (.0011) | .962 (.001) | .952 (.0004) |
SLP | MusicBrainz | Musician | .943 (.0018) | .956 (.0019) | .949 (.0007) |
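For reference, the F-score is the harmonic mean of precision and recall. The sketch below recomputes it from the mean precision and recall reported for NB on Discogs bands; the helper name `f_score` is illustrative. The result can differ slightly from the tabulated value, presumably because the table averages per-fold F-scores rather than computing the F-score of the averaged precision and recall.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Mean precision and recall for NB on Discogs bands, taken from the table above
f_from_means = f_score(0.789, 0.941)
print(round(f_from_means, 3))  # 0.858, versus .859 reported as the mean over folds
```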
Confidence¶
The following plots display the distribution of confidence scores and the total number of predictions yielded by each algorithm on each target classification set.
Note that linear SVM is omitted since it does not output probability scores.
Axes:
x = # predictions;
y = confidence score.
Discogs band¶
Discogs musician¶
IMDb director¶
IMDb musician¶
IMDb producer¶
IMDb writer¶
MusicBrainz band¶
MusicBrainz musician¶
Comparison¶
See the plots above for a rough idea of the number of confident predictions.
Threshold values:
# predictions = predictions with confidence score >= 0.0000000001, i.e., almost all matches;
# confident = predictions with confidence score >= 0.8.
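The two counts can be derived from a list of confidence scores as in the following minimal sketch. The function name `count_predictions` and the sample scores are made up for illustration; only the two threshold values come from this page.

```python
PREDICTION_THRESHOLD = 0.0000000001  # almost all matches
CONFIDENT_THRESHOLD = 0.8


def count_predictions(scores,
                      prediction_threshold=PREDICTION_THRESHOLD,
                      confident_threshold=CONFIDENT_THRESHOLD):
    """Return (# predictions, # confident) for a sequence of confidence scores."""
    predictions = sum(1 for score in scores if score >= prediction_threshold)
    confident = sum(1 for score in scores if score >= confident_threshold)
    return predictions, confident


# Made-up confidence scores, not taken from any soweego run
scores = [0.0, 0.00000000005, 0.3, 0.79, 0.8, 0.95]
totals = count_predictions(scores)
print(totals)  # (4, 2)
```

Every confident prediction also counts as a prediction, so the second value can never exceed the first.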
Discogs band¶
WD items: 50,316
Measure | NB | LSVM | SVM | SLP | MLP |
---|---|---|---|---|---|
Precision | .789 | .785 | .777 | .776 | .833 |
Recall | .941 | .946 | .963 | .957 | .914 |
F-score | .859 | .858 | .86 | .857 | .872 |
# predictions | 820 | 51 | 94,430 | 91,295 | 91,132 |
# confident | 219 | N.A. | 1,660 | 5,355 | 11,114 |
Discogs musician¶
WD items: 199,180
Measure | NB | LSVM | SVM | SLP | MLP |
---|---|---|---|---|---|
Precision | .836 | .814 | .815 | .815 | .849 |
Recall | .958 | .986 | .985 | .985 | .961 |
F-score | .893 | .892 | .892 | .892 | .902 |
# predictions | 3,872 | 200 | 533,301 | 517,450 | 514,488 |
# confident | 1,101 | N.A. | 98,172 | 58,437 | 57,184 |
IMDb director¶
WD items: 9,249
Measure | NB | LSVM | SVM | SLP | MLP |
---|---|---|---|---|---|
Precision | .897 | .919 | .908 | .867 | .916 |
Recall | .971 | .942 | .958 | .953 | .961 |
F-score | .932 | .93 | .932 | .908 | .938 |
# predictions | 192 | 10 | 17,557 | 17,187 | 16,881 |
# confident | 60 | N.A. | 1,616 | 553 | 1,810 |
IMDb musician¶
WD items: 217,139
Measure | NB | LSVM | SVM | SLP | MLP |
---|---|---|---|---|---|
Precision | .891 | .917 | .908 | .922 | .903 |
Recall | .96 | .937 | .942 | .914 | .951 |
F-score | .924 | .927 | .924 | .918 | .926 |
# predictions | 4,806 | 218 | 406,674 | 398,346 | 376,857 |
# confident | 1,341 | N.A. | 21,462 | 7,244 | 16,272 |
IMDb producer¶
WD items: 2,251
Measure | NB | LSVM | SVM | SLP | MLP |
---|---|---|---|---|---|
Precision | .871 | .92 | .923 | .862 | .912 |
Recall | .97 | .938 | .926 | .914 | .956 |
F-score | .918 | .929 | .925 | .883 | .933 |
# predictions | 56 | 3 | 5,249 | 5,116 | 5,094 |
# confident | 15 | N.A. | 507 | 180 | 529 |
IMDb writer¶
WD items: 16,446
Measure | NB | LSVM | SVM | SLP | MLP |
---|---|---|---|---|---|
Precision | .91 | .936 | .932 | .903 | .921 |
Recall | .961 | .948 | .954 | .955 | .962 |
F-score | .935 | .942 | .943 | .928 | .941 |
# predictions | 428 | 17 | 45,122 | 44,338 | 43,868 |
# confident | 138 | N.A. | 2,934 | 1,548 | 3,234 |
MusicBrainz band¶
WD items: 32,658
Measure | NB | LSVM | SVM | SLP | MLP |
---|---|---|---|---|---|
Precision | .822 | .943 | .939 | .93 | .933 |
Recall | .985 | .888 | .893 | .885 | .902 |
F-score | .896 | .914 | .915 | .907 | .918 |
# predictions | 265 | 33 | 39,618 | 38,012 | 33,981 |
# confident | 46 | N.A. | 1,475 | 501 | 1,506 |
MusicBrainz musician¶
WD items: 153,725
Measure | NB | LSVM | SVM | SLP | MLP |
---|---|---|---|---|---|
Precision | .955 | .941 | .95 | .943 | .940 |
Recall | .936 | .962 | .938 | .956 | .968 |
F-score | .946 | .952 | .944 | .949 | .954 |
# predictions | 2,833 | 154 | 280,029 | 260,530 | 194,505 |
# confident | 1,212 | N.A. | 7,496 | 7,339 | 8,470 |