05-10-2022 || 00:30
Tags: #spacy #nlp


The spaCy Matcher provides rule-based matching over tokens and their attributes.

Why not a regular expression?

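  • The Matcher works on Doc objects, so rules can use token attributes instead of only raw characters
  • A word such as "love" can be a noun or a verb, which a regular expression over the text cannot tell apart
  • You can match one specific usage of the word by adding the POS attribute (the pattern key for token.pos_) to the pattern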
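
import spacy
from spacy.matcher import Matcher

# Load the small English pipeline and create a matcher that shares its vocab
nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)

# Define a two-token pattern and add it to the matcher
pattern = [{"TEXT": "iPhone"}, {"TEXT": "X"}]
matcher.add("IPHONE_PATTERN", [pattern])

# Process a text and run the matcher on the resulting Doc
doc = nlp("Upcoming iPhone X release date leaked")
matches = matcher(doc)

# Each match is (match_id, start, end); start and end are token indices
for match_id, start, end in matches:
    print(start, end, doc[start:end].text)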
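
The point about "love" being a noun or a verb can be sketched with a POS constraint in the pattern. This is only a minimal illustration under a few assumptions: the rule name LOVE_VERB, the helper matched_rules, and the two example sentences are made up for this note, and the POS tags are set by hand on a blank English pipeline so the result does not depend on a trained model's predictions. With a pipeline such as en_core_web_sm the tags would normally come from the tagger instead.

```python
import spacy
from spacy.matcher import Matcher
from spacy.tokens import Doc

# A blank English pipeline: tokenizer and vocab only, no trained model needed
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# Match "love" in any casing, but only when the token is tagged as a verb
matcher.add("LOVE_VERB", [[{"LOWER": "love", "POS": "VERB"}]])


def matched_rules(doc):
    """Return (rule name, matched text) for every match found in doc."""
    return [
        (nlp.vocab.strings[match_id], doc[start:end].text)
        for match_id, start, end in matcher(doc)
    ]


# Hand-annotated POS tags (an assumption for this sketch, not model output)
verb_doc = Doc(nlp.vocab, words=["I", "love", "cats"],
               pos=["PRON", "VERB", "NOUN"])
noun_doc = Doc(nlp.vocab, words=["My", "love", "for", "cats"],
               pos=["PRON", "NOUN", "ADP", "NOUN"])

verb_hits = matched_rules(verb_doc)  # "love" used as a verb
noun_hits = matched_rules(noun_doc)  # "love" used as a noun
print(verb_hits)
print(noun_hits)
```

The match_id returned by the matcher is a hash, so looking it up in nlp.vocab.strings recovers the rule name. A regular expression for "love" would hit both sentences, while the POS constraint keeps only the verb usage.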

This matcher can be used with spacy-pattern