Tutorial¶
Getting started¶
- Connect to bibdev: ssh bibdev
- Download alvisnlp_stfilter.tar.gz
- Uncompress it: tar -x -z -f alvisnlp_stfilter.tar.gz (this should create a directory named stfilter)
- Enter that directory: cd stfilter
- Load the environment to run AlvisNLP/ML: source environ.sh
- Run: alvisnlp stfilter.plan
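The local steps above (everything after the ssh connection and the download) can be collected into one shell function that stops at the first failure. This is only a sketch: the function name run_stfilter is hypothetical, it assumes the archive is already in the current directory, and it uses the POSIX form ". ./environ.sh" in place of "source environ.sh".

```shell
# Sketch only: run_stfilter is a hypothetical helper, not part of AlvisNLP/ML.
# Assumes the archive has already been downloaded into the current directory.
run_stfilter() {
    archive="${1:-alvisnlp_stfilter.tar.gz}"

    # Refuse to continue if the archive is missing.
    if [ ! -f "$archive" ]; then
        echo "missing archive: $archive" >&2
        return 1
    fi

    # Uncompress; this should create the stfilter directory.
    tar -x -z -f "$archive" || return 1
    cd stfilter || return 1

    # Load the environment into the current shell (POSIX spelling of "source").
    . ./environ.sh || return 1

    # Run the plan.
    alvisnlp stfilter.plan
}
```

Call run_stfilter from the directory that holds the archive. Because the environment is loaded into the calling shell, that shell stays inside stfilter afterwards, ready for the exercises.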
Exercises¶
Observe stfilter.plan and common.plan¶
- What are sequence and import useful for?
- Which files are read and written by AlvisNLP/ML?
- What is the meaning of taxid,pos,rank in module taxa?
- Add a module to extract all gene occurrences.
- Why are fixed-forms and fixed-forms-overlaps necessary?
- What is classified by train?
- What is the current discrimination performance? Use different classifiers instead of Naive Bayes.
- Where are learning attributes specified?
Observe attr/base.xml¶
- How many learning attributes are there?
- What is the meaning of the length attribute?
- What is the meaning of bag in attr/bow.xml and attr/vici.xml? What is the difference between the two?
- Which one performs best?
- Use word lemmas instead of surface forms. Does it improve the performance?
- Can you think of other attributes?
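As background for the bag question, here is a generic illustration of a bag-of-words representation: a text reduced to the set of its tokens with their counts, ignoring order. It rests on the assumption that bag in the attribute files denotes a representation of this kind, so check the XML files for the actual definition. The sentence and the function name bag_of_words are made up for the example and are unrelated to the tutorial corpus.

```shell
# Hypothetical sentence; not taken from the tutorial data.
sentence="the gene is expressed in the root and in the leaf"

# bag_of_words: print each distinct token preceded by its number of occurrences.
# Word order is lost; only which tokens occur, and how often, is kept.
bag_of_words() {
    printf '%s\n' "$1" |
        tr -s ' ' '\n' |   # one token per line
        sort |             # group identical tokens together
        uniq -c            # count each group
}

bag_of_words "$sentence"
```

Each output line corresponds to one candidate learning attribute: the token is the attribute name and the count is its value for this sentence.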
Using dependencies¶
- Change the POS tagging from TreeTagger to CCGPosTagger.
- Add CCGParser to common.plan.
- Wow! That takes too much time! Investigate how you could use -dumpModule and -resume to your advantage.
- Use dependencies in the learning attributes.