Hard to search assets by substring of title or filename

We have a growing number of assets, and we need to link them to entries in our content model. This means users need to be able to search for these assets easily within the Contentful web application.

Sadly, the way Contentful currently matches search strings against assets makes the process painful.

For example, we have assets with similar, hyphenated titles like foo-document-2015.pdf, foo-document-2016.pdf, foo-document-2017.pdf, and so on (made-up names for this example, but yes, our asset titles look like this).

On the Media page, and in the “Link existing assets” dialog when editing an entry, I can type a search term to filter assets. However, this doesn’t work how I expect, and I can’t figure out how it actually works.

For example, for a particular document with title foobar-baz-index-2018-q1.pdf, Contentful finds the asset with the search term title: foobar, but NOT title: baz. Perversely, it finds the asset with title: foobar-baz-index-2018-q1.pd, but not with title: foobar-baz-index-2018-q1.p or any other substring I have yet found.

My expectation is that Contentful would at least look for the search term as a substring of the filename and/or title of the asset. This doesn’t seem to happen. If I use the title filter, or the fileName filter, it still doesn’t help – and asset search doesn’t support MATCHES operators on these two filters.

Scrolling through page after page of assets is not a scalable answer.

This is frustrating. Any ideas?

Hi @avaragado,

You are correct. At the moment, asset search is “left-anchored”, which is a limitation on our side. Hyphenated strings are not further tokenized, so unfortunately you can’t perform some of the queries you describe. I have already made our engineering team aware of this limitation.
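To make the “left-anchored, not tokenized on hyphens” idea concrete, here is a toy model in Python. It is only an illustration, not Contentful’s actual search code: the function name and the whitespace-only tokenizing are assumptions. It reproduces the foobar / baz / full-prefix observations above, but it does not explain why the “…q1.p” prefix fails, so the real behaviour is evidently more involved.

```python
def left_anchored_match(query: str, text: str) -> bool:
    """Toy model: match only if some whitespace-separated token STARTS with the query.

    Hyphens are not treated as separators, so a hyphenated title is one token.
    """
    q = query.lower()
    return any(token.startswith(q) for token in text.lower().split())


title = "foobar-baz-index-2018-q1.pdf"

print(left_anchored_match("foobar", title))                       # True: prefix of the single token
print(left_anchored_match("baz", title))                          # False: "baz" is in the middle of the token
print(left_anchored_match("foobar-baz-index-2018-q1.pd", title))  # True: still a prefix
```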

Let me know if that makes sense or if you have any other questions. :grinning:

Thanks for your reply. Right now it seems the rule is “don’t use asset titles or filenames with hyphens”, which is not something I can easily enforce.

I guess a workaround is to add descriptions to the hyphenated assets, using spaces instead of hyphens. It’s clunky but it works.
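For anyone wanting to generate those descriptions rather than type them, a minimal Python sketch of the hyphen-to-space step is below. The helper name is made up for illustration, and writing the result back to the asset’s description field is left out because it depends on how you manage your assets.

```python
import re


def searchable_description(title: str) -> str:
    """Replace each run of hyphens with a single space so every part becomes its own word."""
    return re.sub(r"-+", " ", title).strip()


print(searchable_description("foobar-baz-index-2018-q1.pdf"))
# foobar baz index 2018 q1.pdf
```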

Does the engineering team have any plans to change anything here? Any timescales?

We don’t have a clear plan or ETA in mind at the moment, but doing what you described should indeed help out (switching from hyphen to spaces). Has this workaround produced any undesired results up to this point? (e.g. retrieving unrelated assets)

I haven’t seen any problems yet.

We’re experiencing this issue too, and it’s very frustrating that it hasn’t been addressed in the last 3 years. There’s no simple way to filter media/assets by file extension. With 13k images, what is the easiest way to find only our .svg or .jpg files?


Has anyone solved this yet?
Even through the API there is seemingly no reliable way to find all SVG images.
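One fallback that does not depend on the search behaviour at all is to fetch every asset record and filter on the client. The Python sketch below only shows the filtering step. It assumes each record is a dict with the file name at fields → file → fileName; that shape is an assumption (some responses may key fields by locale), and the sample records are invented for the example.

```python
def assets_with_extension(assets, extension):
    """Return the asset records whose file name ends with the given extension.

    Assumes each record looks like {"fields": {"file": {"fileName": "..."}}};
    adjust the lookup if your records are structured differently.
    """
    ext = "." + extension.lower().lstrip(".")
    matches = []
    for asset in assets:
        file_info = asset.get("fields", {}).get("file") or {}
        file_name = file_info.get("fileName", "")
        if file_name.lower().endswith(ext):
            matches.append(asset)
    return matches


# Invented sample records, standing in for assets already fetched page by page.
sample_assets = [
    {"fields": {"title": "Logo", "file": {"fileName": "logo.SVG"}}},
    {"fields": {"title": "Photo", "file": {"fileName": "team-photo.jpg"}}},
    {"fields": {"title": "Draft without a file"}},
]

svgs = assets_with_extension(sample_assets, "svg")
print([a["fields"]["title"] for a in svgs])  # ['Logo']
```

With 13k assets this means paging through everything once, which is slow but at least gives a complete answer.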