Are you implementing a new index AM or a new table AM? Discarding data
based on something like a relevance score doesn't seem like something
that either API provides for. Indexes in Postgres can be lossy, but
that in itself doesn't change the result of queries.
(Sorry if this doesn't quote properly; I'm still figuring out how to do the quote-and-bottom-post thing in Gmail.)
My plan was to write only an index AM, but I'm starting to think that isn't going to work. The goal is better full-text search in Postgres that stays fast over really large datasets.
Relevance scoring is like an ORDER BY score with a LIMIT. The code that traverses the index needs to know both of these things in advance.
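To make that concrete, here is a rough sketch in Python (not Postgres code) of what I mean by the traversal knowing both the score and the limit. The names top_k, postings and score are made up for illustration. Once k hits are held, anything scoring below the weakest of them is dropped on the spot, so only k candidates are ever kept in memory.

```python
import heapq

def top_k(postings, score, k):
    """Return the k highest-scoring doc ids, best first.

    postings: iterable of doc ids matching the query (hypothetical)
    score:    callable mapping a doc id to a relevance score (hypothetical)
    """
    if k <= 0:
        return []
    heap = []  # min-heap of (score, doc_id); heap[0] is the weakest kept hit
    for doc_id in postings:
        s = score(doc_id)
        if len(heap) < k:
            heapq.heappush(heap, (s, doc_id))
        elif s > heap[0][0]:
            # Beats the current weakest hit, so replace it.
            heapq.heapreplace(heap, (s, doc_id))
        # Otherwise the hit is discarded without ever being stored.
    return [doc_id for _, doc_id in sorted(heap, reverse=True)]

scores = {1: 0.2, 2: 0.9, 3: 0.5, 4: 0.7}
best = top_k(scores, scores.get, 2)
```

This version still scores every matching document. A real implementation would presumably also use the current cutoff (heap[0]) to skip whole parts of the index, which is why the limit has to be known before the scan starts.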
The GIN code doesn't cut it. I'm still trying to understand the code for the RUM index type, but it's slow going.
Suggestions on how to go about this are welcome.