Automated Tools for Dark Web Data Tracking
September 15, 2016
Naturally, tracking stolen data through the dark web is a challenge. Investigators have traditionally infiltrated chatrooms and forums, a tedious procedure with no guarantee of success. Now, automated tools may give organizations a leg up, we learn from the article, “Tools to Track Stolen Data Through the Dark Web” at GCN. Reporter Mark Pomerleau informs us:
“The Department of Veterans Affairs last month said it was seeking software that can search the dark web for exploited VA data improperly outside its control, distinguish between VA data and other data, and create a ‘one-way encrypted hash’ of VA data to ensure that other parties cannot ascertain or use it. The software would also use VA’s encrypted data hash to search the dark web for VA content.” We learned:
Some companies, such as Terbium Labs, have developed similar hashing technologies. ‘It’s not code that’s embedded in the data so much as a computation done on the data itself,’ Danny Rogers, a Terbium Labs co-founder, told Defense One regarding its cryptographic hashing. This capability essentially enables a company or agency to recognize its stolen data if discovered. Bitglass, a cloud access security broker, uses watermarking technology to track stolen data. A digital watermark or encryption algorithm is applied to files such as spreadsheets, Word documents or PDFs that requires users to go through an authentication process in order to access it.
We’re told such watermarks can even thwart hackers who try to copy-and-paste content into a new document, and that Bitglass tests its technology by leaking false data onto the dark web and tracking it. Pomerleau notes that regulations can make it difficult to implement commercial solutions within a government agency. However, government personnel are highly motivated to find solutions that will let them work securely outside the office.
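Neither GCN nor the vendors publish implementation details, but the one-way hash idea is simple enough to sketch. Below is a minimal illustration in Python, with invented sample records and a hypothetical key: keyed HMAC-SHA-256 fingerprints of sensitive records are compared against tokens scraped from a dark web page, so matches can be detected without the fingerprints themselves revealing the underlying data.

```python
import hashlib
import hmac

# Hypothetical agency-held key; because the hash is keyed, an outside
# party who obtains the fingerprints cannot ascertain or reproduce the data.
SECRET_KEY = b"agency-held-secret"

def fingerprint(record: str) -> str:
    """One-way keyed hash (HMAC-SHA-256) of a normalized record."""
    normalized = record.strip().lower()
    return hmac.new(SECRET_KEY, normalized.encode("utf-8"), hashlib.sha256).hexdigest()

# Fingerprints of the agency's own sensitive records (sample data, invented).
known = {fingerprint(r) for r in ["123-45-6789", "jane.doe@example.gov"]}

# Tokens scraped from a dark web page (also invented).
scraped_tokens = ["lorem", "123-45-6789", "ipsum"]

for token in scraped_tokens:
    if fingerprint(token) in known:
        print(f"Match found: scraped page contains agency data ({token!r})")
```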
The article wraps up with a mention of DARPA’s Memex search engine, designed to plumb the even-more-extensive deep web. Law enforcement is currently using Memex, but the software is expected to eventually make it to the commercial market.
Cynthia Murrell, September 15, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
Mobile Data May Help Fight Disease
September 14, 2016
Data from smartphones and other mobile devices may give us a new tool in the fight against communicable diseases. Penn State News reports, “Walking and Talking Behaviors May Help Predict Epidemics and Trends.” A recent study, completed by an impressive roster of academics at several institutions, reveals a strong connection between our movements and our communications. So strong, in fact, that a dataset on one can pretty accurately predict the other. The article cites one participant, researcher Dashun Wang of Penn State:
[Wang] added that because movement and communication are connected, researchers may only need one type of data to make predictions about the other phenomenon. For instance, communication data could reveal information about how people move. …
The equation could better forecast, among other things, how a virus might spread, according to the researchers, who report their findings today (June 6) in the Proceedings of the National Academy of Sciences. In the study, they tested the equation on a simulated epidemic and found that either location or communication datasets could be used to reliably predict the movement of the disease.
Perhaps not as dramatic but still useful, the same process could be used to predict the spread of trends and ideas. The research drew on three databases of messages from users in Portugal and another (mysteriously unidentified) country, and on four years of Rwandan mobile-phone data. These data sets document who contacted whom, when, and where.
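The article does not reproduce the researchers’ equation, but the core intuition, that a who-called-whom graph can stand in for a who-met-whom graph, is easy to illustrate. Here is a toy sketch in Python, with invented call records and illustrative transmission rates (not taken from the paper), simulating disease spread over a contact network built purely from communication data:

```python
import random

random.seed(42)

# Hypothetical call records: (caller, callee) pairs standing in for
# face-to-face contacts, per the movement/communication connection.
calls = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "e"), ("e", "f"), ("d", "f")]

# Build an undirected contact graph from the communication data alone.
neighbors = {}
for u, v in calls:
    neighbors.setdefault(u, set()).add(v)
    neighbors.setdefault(v, set()).add(u)

infected, recovered = {"a"}, set()
TRANSMIT, RECOVER = 0.5, 0.3  # illustrative probabilities, not from the study

for step in range(10):
    newly_infected = set()
    for person in infected:
        for contact in neighbors[person]:
            if contact not in infected and contact not in recovered:
                if random.random() < TRANSMIT:
                    newly_infected.add(contact)
    newly_recovered = {p for p in infected if random.random() < RECOVER}
    infected = (infected | newly_infected) - newly_recovered
    recovered |= newly_recovered
    print(f"step {step}: infected={sorted(infected)}, recovered={sorted(recovered)}")
```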
Containing epidemics is a vital cause, and the potential to boost its success is worth celebrating. However, let us take note of who is funding this study: the U.S. Army Research Laboratory, the Office of Naval Research, the Defense Threat Reduction Agency, and the James S. McDonnell Foundation’s program, Studying Complex Systems. Note the first three organizations in the list; it will be interesting to learn what other capabilities derive from this research (once they are declassified, of course).
Cynthia Murrell, September 14, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
Toshiba Amps up Vector Indexing and Overall Data Matching Technology
September 13, 2016
The article on MyNewsDesk titled Toshiba’s Ultra-Fast Data Matching Technology is 50 Times Faster than its Predecessors relates the bold claims swirling around Toshiba and its Vector Indexing Technology. By skipping the computation of distances between vectors, Toshiba has slashed the time it takes to identify vectors (it claims). The article states:
Toshiba initially intends to apply the technology in three areas: pattern mining, media recognition and big data analysis. For example, pattern mining would allow a particular person to be identified almost instantly among a large set of images taken by surveillance cameras, while media recognition could be used to protect soft targets, such as airports and railway stations, by automatically identifying persons wanted by the authorities.
In sum, Toshiba’s technology can quickly and accurately recognize faces in a crowd. The specifics are even more interesting: current technology takes around 20 seconds to identify an individual among 10 million images, while Toshiba claims to do it in under a second. The precision rate Toshiba reports is also outstanding, at 98%. The world of Minority Report, where ads recognize and target random individuals, seems increasingly within reach. Perhaps more importantly, this technology should be of dire concern to the criminal, and perceived criminal, populations of the world.
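Toshiba has not published its method, so any code here is guesswork; but the general trick of skipping exhaustive distance computation is well known. One standard approach is locality-sensitive hashing: bucket vectors by random hyperplanes so a query only needs exact distances against a handful of candidates. A minimal sketch in Python with NumPy, all data invented:

```python
import numpy as np

rng = np.random.RandomState(0)
DIM, N_PLANES = 64, 16

database = rng.normal(size=(10000, DIM))   # e.g., face-feature vectors
planes = rng.normal(size=(N_PLANES, DIM))  # random hyperplanes define the hash

def lsh_key(vec):
    """Bucket a vector by which side of each random hyperplane it falls on."""
    bits = (planes @ vec) > 0
    return tuple(bits)

# Index once: bucket -> list of vector ids. No pairwise distances computed here.
buckets = {}
for i, vec in enumerate(database):
    buckets.setdefault(lsh_key(vec), []).append(i)

# A noisy copy of vector 1234 stands in for a new surveillance image.
query = database[1234] + 0.01 * rng.normal(size=DIM)

# Exact distances are computed only over the small candidate bucket,
# not over all 10,000 vectors in the database.
candidates = buckets.get(lsh_key(query), [])
best = min(candidates, key=lambda i: np.linalg.norm(database[i] - query), default=None)
print(f"candidates checked: {len(candidates)}, best match id: {best}")
```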
Chelsea Kerwin, September 13, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
How Collaboration and Experimentation Are Key to Advancing Machine Learning Technology
September 12, 2016
The article on CIO titled Machine Learning “Still a Cottage Industry” conveys the sentiments of a man at the heart of the industry in Australia, Professor Bob Williamson, chief scientist of the Commonwealth Scientific and Industrial Research Organisation’s (CSIRO’s) Data61 group. His work in machine learning and data analytics has led him to the conclusion that for machine learning to truly move forward, scientists must find a way to collaborate. He is quoted in the article:
“There’s these walled gardens: ‘I’ve gone and coded my models in a particular way, you’ve got your models coded in a different way, we can’t share.’ This is a real challenge for the community. No one’s cracked this yet.” A number of start-ups have entered the “machine-learning-as-a-service” market, such as BigML, Wise.io, and Precog, and the big names, including IBM, Microsoft, and Amazon, haven’t been far behind. Though these MLaaS offerings herald some impressive results, Williamson warned businesses to be cautious.
Williamson speaks to the possibility of stagnation in machine learning due to the emphasis on data mining as opposed to experimenting. He hopes businesses will do more with their data than simply look for patterns. It is a refreshing take on the industry from an outsider/insider, a scientist more interested in the science of it all than the massive stacks of cash at stake.
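Williamson’s walled-garden complaint is concrete: models coded one way cannot be loaded by a stack coded another way. A hedged sketch of the simplest possible case, assuming a plain linear model and toy data: serialize only the learned parameters to a neutral JSON format that any stack can reload. Real models are rarely this portable, which is rather Williamson’s point.

```python
import json

import numpy as np
from sklearn.linear_model import LinearRegression

# Team A trains a model inside its own stack (toy data for illustration).
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.1, 3.9, 6.2, 8.1])
model = LinearRegression().fit(X, y)

# Export only the learned parameters in a framework-neutral format.
portable = json.dumps({
    "type": "linear_regression",
    "coef": model.coef_.tolist(),
    "intercept": float(model.intercept_),
})

# Team B, on a different stack, reloads and predicts without sklearn.
params = json.loads(portable)

def predict(features):
    return sum(c * f for c, f in zip(params["coef"], features)) + params["intercept"]

print(predict([5.0]))  # close to model.predict([[5.0]])
```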
Chelsea Kerwin, September 12, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
Revenue Takes a Backseat to Patent Filings at IBM
September 9, 2016
The post on Slashdot titled IBM Has Been Awarded an Average of 24 Patents Per Day So Far in 2016 compares the patent development emphasis of major companies, with IBM coming out on top with 3,617 patent awards so far in 2016, according to a Quartz report. Patents are the by-product of IBM’s focus on scientific research, as the report finds:
The company is in the middle of a painful reinvention that sees the company shifting further away from hardware sales into cloud computing, analytics, and AI services. It’s also plugging away on a myriad of fundamental scientific research projects — many of which could revolutionize the world if they can come to fruition — which is where many of its patent applications originate. IBM accounted for about 1% of all US patents awarded in 2015.
Samsung claimed a close second (with just over 3,000 patents), and on the next rung down sit Google (with roughly 1,500 patents for the same period), Intel, Qualcomm, Microsoft, and Apple. Keep in mind, though, that IBM and Samsung have each been awarded roughly twice as many patents as Google and the rest, making IBM look like an unstoppable patent machine. You may well ask: what about revenue? IBM will get back to you on that score later.
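The headline arithmetic is easy to sanity-check, assuming the Quartz tally runs through roughly the end of May:

```python
patents_awarded = 3617               # IBM's 2016 tally per the Quartz report
days_implied = patents_awarded / 24  # the "24 per day" headline rate
print(round(days_implied))           # about 151 days, i.e., January through May
```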
Chelsea Kerwin, September 9, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
Is Google Biotech Team Overreaching?
September 9, 2016
Science reality is often inspired by science fiction, and Google’s biotech research division, Verily Life Sciences, is no exception. Business Insider posts, “‘Silicon Valley Arrogance’? Google Misfires as It Strives to Turn Star Trek Fiction Into Reality.” The “Star Trek” reference points to Verily’s Tricorder project, announced three years ago, which set out to create a cancer-early-detection device. Sadly, that hopeful venture may be sputtering out. STAT reporter Charles Piller writes:
Recently departed employees said the prototype didn’t work as hoped, and the Tricorder project is floundering. Tricorder is not the only misfire for Google’s ambitious and extravagantly funded biotech venture, now named Verily Life Sciences. It has announced three signature projects meant to transform medicine, and a STAT examination found that all of them are plagued by serious, if not fatal, scientific shortcomings, even as Verily has vigorously promoted their promise.
Piller cites two projects, besides the Tricorder, that underwhelm. We’re told that independent experts are dubious about the development of a smart contact lens that can detect glucose levels for diabetics. Then there is the very expensive Baseline study, an attempt to define what it means to be healthy and to catch diseases earlier, which critics call “lofty” and “far-fetched.” Not surprisingly, Google being Google, there are also privacy concerns about the data being collected to feed the study.
There are several criticisms and specific examples in the lengthy article, and interested readers should check it out. There seems to be one central notion, though: that Verily Life Sciences is attempting to approach the human body like a computer, when medicine is much, much more complicated than that. The impressive roster of medical researchers on the team seems to provide little solace to critics. The write-up relates:
It’s axiomatic in Silicon Valley’s tech companies that if the math and the coding can be done, the product can be made. But seven former Verily employees said the company’s leadership often seems not to grasp the reality that biology can be more complex and less predictable than computers. They said Conrad, who has a PhD in anatomy and cell biology, applies the confident impatience of computer engineering, along with extravagant hype, to biotech ideas that demand rigorous peer review and years or decades of painstaking work.
Are former employees the most objective source? I suspect ex-Googlers and third-party scientists are underestimating Google. The company has a history of reaching the moon by shooting for the stars, and of enduring a few failures as the price of success. I would not be surprised to see Google emerge on top of the biotech field. (As sci-fi fans know, biotech is the medicine of the future. You’ll see.) The real question is how the company will treat privacy, data rights, and patient safety along the way.
Cynthia Murrell, September 9, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
Palantir: More Legal Excitement
September 6, 2016
One of the Beyond Search goslings directed my attention to a legal document, “Palantir Technologies Inc. (‘Palantir’) Sues Defendants Marc L. Abramowitz…” The 20-page complaint asserts that a Palantir investor sucked in proprietary information and then used that information outside the boundaries of Sillycon Valley norms of behavior. These norms apply to the one percent of the one percent, in my opinion.
The legal “complaint” points to several patent documents which embodied Palantir’s proprietary information. Locating the documents requires the Justia system; specifically, Provisional Application No. 62/072,36; Provisional Application No. 62/066,716; and Provisional Application No. 62/094,888. These provisional applications, I concluded, reveal that Palantir seeks to enter insurance and health care markets. Their disclosure appears to put Palantir Technologies at a competitive disadvantage.
Who is the individual named in the complaint?
Marc Abramowitz, who is associated with an outfit named KT4. KT4 does not have much of an online presence. The sparse information available to me about Abramowitz is that he is a Harvard-trained lawyer connected to Stanford’s Hoover econo-think unit. Abramowitz’s link to Palantir is that he invested in the company and made visits to the Hobbits’ Palo Alto “shire” part of his work routine.
Despite the legalese, the annoyance of Palantir with Abramowitz seeps through the sentences.
For me, what is interesting is that IBM i2 asserted several years ago that Palantir Technologies improperly tapped into proprietary methods used in the Analyst’s Notebook software product and system. See “i2 and Palantir: Resolved Quietly.”
One new twist is that the Palantir complaint against Abramowitz includes a reference to Abramowitz’s appropriation of the word “Shire.” For those not in the know in Sillycon Valley, Palantir refers to its Palo Alto office as the shire.
When I read the document, I did not spot a reference to Hobbits or seeing stones.
When I checked this morning (September 6, 2016), the document was still publicly accessible at the link above. However, Palantir’s complaint about the US Army’s procurement system was sealed shortly after it was filed. This Abramowitz complaint may go away for some folks as well. If you can’t locate the Abramowitz document, you will have to up your legal research game. My hunch is that neither Palantir nor Mr. Abramowitz will respond to your request for a copy.
There are several hypothetical, Tolkienesque cyclones swirling out of this dust-up between an investor and the Palantir outfit, which is alleged to be a mythical unicorn:
- Trust seems to need a more precise definition when dealing with either Palantir or Abramowitz
- Some folks use Tolkien’s jargon and don’t want anyone else to “horn in” on this appropriation
- Filing patents on relatively narrow “new” concepts when one does not have a software engineering track record goes against the accepted norms of innovation
- IBM i2’s team may watch the trajectory of this Abramowitz matter more attentively than the next IBM Watson marketing innovation.
Worth monitoring just for the irony molecules in this Palantir complaint. WWTT, or what would Tolkien think? Perhaps a quick check of the seeing stone is appropriate.
Stephen E Arnold, September 6, 2016
Government Seeks Sentiment Analysis on Its PR Efforts
September 6, 2016
Sentiment analysis is taking off; government agencies are using it for PR purposes. Nextgov released a story, Spy Agency Wants Tech that Shows How Well Its PR Team Is Doing, which covers the National Geospatial-Intelligence Agency’s request for information about sentiment analysis. The NGA hopes to use this technology to assess its PR efforts to increase public awareness of the agency and communicate its mission, especially to groups such as college students, recruits, and those in the private sector. Commenting on the bigger picture, the author writes,
The request for information appears to be part of a broader effort within the intelligence community to improve public opinion about its operations, especially among younger, tech-savvy citizens. The CIA has been using Twitter since 2014 to inform the public about the agency’s past missions and to demonstrate that it has a sense of humor, according to a Nextgov interview last year with its social media team. The CIA’s social media director said at the time there weren’t plans to use sentiment analysis technology to analyze the public’s tweets about the CIA because it was unclear how accurate those systems are.
The technologies underpinning sentiment analysis, such as natural language processing and computational linguistics, are attractive in many sectors for PR and other purposes; the government is no exception. Especially now that the CIA and other organizations are using social media, the space is certainly ripe for government sentiment analysis. Still, we must echo the accuracy question posed by the CIA’s social media director.
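The NGA’s request describes capabilities, not implementations, but off-the-shelf tooling gives a flavor of what it is asking for. A minimal sketch using NLTK’s VADER analyzer, with invented sample posts, scores text as positive or negative; its accuracy limits are exactly what the CIA’s social media director worried about:

```python
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)  # one-time lexicon fetch
analyzer = SentimentIntensityAnalyzer()

# Invented public posts mentioning the agency.
posts = [
    "Great talk by the NGA recruiters at the career fair!",
    "Another opaque spy agency vacuuming up data, no thanks.",
]

for post in posts:
    scores = analyzer.polarity_scores(post)  # neg/neu/pos plus compound in [-1, 1]
    label = "positive" if scores["compound"] >= 0 else "negative"
    print(f"{label:8s} {scores['compound']:+.2f}  {post}")
```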
Megan Feil, September 6, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on September 27, 2016.
Information is at this link: https://www.meetup.com/Louisville-Hidden-Dark-Web-Meetup/events/233599645/
Google Enables Users to Delete Search History, Piece by Piece
August 31, 2016
The article on CIO titled Google Quietly Brings Forgetting to the U.S. draws attention to the fact that Google has enabled Americans to view and edit their search histories. Simply visit My Activity and log in to witness the mind-boggling amount of data Google has collected over your search career. Deleting an item takes just two clicks. But the article points out that deleting a lot of searches will require an afternoon dedicated to cleaning up your history. And afterward, you might find that your searches are less customized, as are your ads and autofills. But the article emphasizes a more communal concern,
There’s something else to consider here, though, and this has societal implications. Google’s forget policy has some key right-to-know overlaps with its takedown policy. The takedown policy allows people to request that stories about or images of them be removed from the database. The forget policy allows the user to decide on his own to delete something…I like being able to edit my history, but I am painfully aware that allowing the worst among us to do the same can have undesired consequences.
Of course, by “the worst among us” he means terrorists. But for many people, the right to privacy is more important than whatever hypothetical setbacks terrorists might suffer under a more totalitarian, Big Brother state. Indeed, Google’s claim that the search history information is entirely private is already suspect. If Google personnel or Google partners can see this data, doesn’t that mean it is no longer private?
Chelsea Kerwin, August 31, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph
Technical Debt and Technical Wealth
August 29, 2016
I read “Forget Technical Debt. Here’s How to Build Technical Wealth.” Lemons? Make lemonade. Works almost every time.
The write up begins with a reminder that recent code which is tough to improve is a version of legacy code. I understand. I highlighted this statement:
Legacy code isn’t a technical problem. It’s a communication problem.
I am not sure I understand. But let’s move forward in the write up. I noted this statement:
“It’s the law that says your codebase will mirror the communication structures across your organization. If you want to fix your legacy code, you can’t do it without also addressing operations, too. That’s the missing link that so many people miss.”—Andrea Goulet, CEO of Corgibytes
So what’s the fix for legacy code at an outfit like Delta Airlines or the US air traffic control system or the US Internal Revenue Service or a Web site crafted in 1995?
I highlighted this advice:
Forget debt, build technical wealth.
Very MBA-ish. I trust MBAs. Heck, I have affection for some, well, one or two. The mental orientation struck me as quite Wordsworthian:
Stop thinking about your software as a project. Start thinking about it as a house you will live in for a long time…
Just like with a house, modernization and upkeep happens in two ways: small, superficial changes (“I bought a new rug!”) and big, costly investments that will pay off over time (“I guess we’ll replace the plumbing…”). You have to think about both to keep your product current and your team running smoothly. This also requires budgeting ahead — if you don’t, those bigger purchases are going to hurt. Regular upkeep is the expected cost of home ownership. Shockingly, many companies don’t anticipate maintenance as the cost of doing business.
Okay, let’s think about legacy code in something like a “typical” airline or a “typical” agency of the US Executive Branch. Efforts have been made over the last 20 years to improve the systems. Yet these outfits, like many commercial enterprises, are a digital Joseph’s coat of many systems, software, hardware, and methods. The idea is to keep the IRS up and running; that is, good enough to remain dry when it rains and pours.
There is, in my opinion, not enough money to “fix” the IRS systems. Even if there were money, the problem of code written by many hands over many years would remain intractable. The idea of “menders” is a good one. But where does one find enough menders to remediate the systems at a big outfit?
Google’s approach is to minimize “legacy” code in some situations. See “Google Is in a Vicious Build Retire Cycle.”
The MBA charts, graphs, and checklists do not deliver wealth. The approach sidesteps a very important fact: there are legacy systems which, if they crash, are increasingly difficult to get back up and running. The thought of remediating systems coded by folks long since retired or deceased is something that few people, including me, have a desire to contemplate. Legacy code is a problem, and there is no quick, easy, business-school fix that I know about.
Maybe somewhere? Maybe someplace? Just not in Harrod’s Creek.
Stephen E Arnold, August 29, 2016