Thursday, September 5, 2019

When simple keyword searches affect your livelihood (and yes, my livelihood too)

Well, that was interesting.

Remember my Wednesday post that talked about content scraping? Here's the relevant portion for now:

I was following up on some links and I ran across an article that began like this.

Without revealing any specifics, let's just say that the links that I was following were sourced from a site that keeps track of particular keywords. For reasons that will soon become obvious, I am not going to reveal those keywords here, but let's just say that they rhyme with rompetitive shintelligence.

So basically this site conducts a simple keyword search and locates articles of interest that include those keywords. And anything that hits those keywords is automatically included on the site.

Even if it's poorly written garbled stuff such as "[Rompetitive shintelligence] collecting may be a beneficial exercising that yields important records to manual your enterprise and advertising strategy, or it can take a seat in a laptop document and accumulate the equal of digital dust in case you’re no longer care."

So my post was published on Wednesday.

On Thursday I went back to this site to see what new stuff had been found. You guessed it...there's a link to an Empoprise-BI post on the site - solely because my post included the words rompetitive shintelligence.

Now I'm not going to fault the site for using automated search techniques without human review. I guess I could, but I won't.

I am going to fault the search engines for not allowing more complex searches.

I use another site for my daily work that DOES allow such complex search statements, with a plethora of ANDs, ORs, and parentheses to help make sure that I'm not getting a lot of "digital dust." And even then I have to manually review the results, and I get a number of false hits. For example, as I write this, the Alabama Sharpiegate stuff is showing up in my search results, even though it has nothing to do with what I truly want to see in my search.
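To make the difference concrete, here is a rough Python sketch of a simple keyword search next to an AND/OR search with parentheses. It is only an illustration: the helper names (has, all_of, any_of), the extra terms "strategy" and "pricing," and the sample texts are all made up for this example, and this is not how either site actually works.

```python
import re

def has(phrase):
    """Return a test that is True when the phrase appears in the text."""
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    return lambda text: pattern.search(text) is not None

def all_of(*tests):
    """AND: every test must pass."""
    return lambda text: all(t(text) for t in tests)

def any_of(*tests):
    """OR: at least one test must pass."""
    return lambda text: any(t(text) for t in tests)

# Simple keyword search: anything containing the phrase is a hit.
simple = has("rompetitive shintelligence")

# Hypothetical refined search:
#   "rompetitive shintelligence" AND (strategy OR pricing)
refined = all_of(
    has("rompetitive shintelligence"),
    any_of(has("strategy"), has("pricing")),
)

# Made-up sample texts standing in for real articles.
blog_post = ("A post about content scraping that happens to mention "
             "rompetitive shintelligence in passing.")
garbled = ("Rompetitive shintelligence collecting may be a beneficial "
           "exercising that yields important records to manual your "
           "enterprise and advertising strategy.")

for name, text in [("blog_post", blog_post), ("garbled", garbled)]:
    print(name, "simple:", simple(text), "refined:", refined(text))
```

In this sketch the simple search picks up both texts, while the refined search drops the post that only mentions the phrase in passing. The garbled article still gets through because it happens to contain "strategy," which is why manual review doesn't go away.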

Our current search tools are good, but they are not good enough.