July 21, 2016
Is big data good only for the hard sciences, or does it have something to offer the humanities? Writer Marcus A Banks thinks it does, as he states in “Challenging the Print Paradigm: Web-Powered Scholarship is Set to Advance the Creation and Distribution of Research” at the Impact Blog (a project of the London School of Economics and Political Science). Banks suggests that data analysis can lead to a better understanding of, for example, how the perception of certain historical events has evolved over time. He goes on to explain what the literary community has to gain by moving forward:
“Despite my confidence in data mining I worry that our containers for scholarly works — ‘papers,’ ‘monographs’ — are anachronistic. When scholarship could only be expressed in print, on paper, these vessels made perfect sense. Today we have PDFs, which are surely a more efficient distribution mechanism than mailing print volumes to be placed onto library shelves. Nonetheless, PDFs reinforce the idea that scholarship must be portioned into discrete units, when the truth is that the best scholarship is sprawling, unbounded and mutable. The Web is flexible enough to facilitate this, in a way that print could never do. A print piece is necessarily reductive, while Web-oriented scholarship can be as capacious as required.
“To date, though, we still think in terms of print antecedents. This is not surprising, given that the Web is the merest of infants in historical terms. So we find that most advocacy surrounding open access publishing has been about increasing access to the PDFs of research articles. I am in complete support of this cause, especially when these articles report upon publicly or philanthropically funded research. Nonetheless, this feels narrow, quite modest. Text mining across a large swath of PDFs would yield useful insights, for sure. But this is not ‘data mining’ in the maximal sense of analyzing every aspect of a scholarly endeavor, even those that cannot easily be captured in print.”
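To make the kind of text mining Banks mentions concrete, here is a minimal sketch in Python. The corpus, years, and search term are all invented for illustration; a real effort would extract text from thousands of PDFs rather than a hand-typed dictionary.

```python
from collections import Counter
import re

# Hypothetical mini-corpus: year -> article text (invented examples).
corpus = {
    1990: "The revolution was seen as a triumph. A triumph of the people.",
    2000: "Historians now call the revolution a tragedy as much as a triumph.",
    2010: "Recent work treats the revolution primarily as a tragedy.",
}

def term_frequency(text: str, term: str) -> float:
    """Fraction of words in `text` equal to `term` (case-insensitive)."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words)[term] / len(words) if words else 0.0

# Track how the framing of one event shifts over time.
trend = {year: term_frequency(text, "tragedy") for year, text in corpus.items()}
for year in sorted(trend):
    print(year, round(trend[year], 3))
```

Even this toy version shows the appeal: run the same measurement over decades of writing and the changing perception of an event becomes a curve rather than an impression.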
Banks does note that a cautious approach to such fundamental change is warranted, citing the development of the data paper in 2011 as an example. He also mentions Scholarly HTML, a project that hopes to evolve into a formal W3C standard, and the Content Mine, a project aiming to glean 100 million facts from published research papers. The sky is the limit, Banks indicates, when it comes to Web-powered scholarship.
Cynthia Murrell, July 21, 2016
There is a Louisville, Kentucky Hidden Web/Dark Web meet up on July 26, 2016. Information is at this link: http://bit.ly/29tVKpx.
July 19, 2016
Deep learning is another bit of technical jargon floating around, and it is tied to artificial intelligence. We know that artificial intelligence is the process of replicating human thought patterns and actions through computer software. Deep learning is…well, what specifically? To get a primer on what deep learning is, as well as its many applications, check out “Deep Learning: An MIT Press Book” by Ian Goodfellow, Yoshua Bengio, and Aaron Courville.
Here is how the Deep Learning book is described:
“The Deep Learning textbook is a resource intended to help students and practitioners enter the field of machine learning in general and deep learning in particular. The online version of the book is now complete and will remain available online for free. The print version will be available for sale soon.”
This is a fantastic resource to take advantage of. MIT is one of the leading technical schools in the nation, if not the world, and the information it sponsors is sure to round out your deep learning foundation. Also, it is free, which cannot be beaten. Here is how the book explains the goal of machine learning:
“This book is about a solution to these more intuitive problems. This solution is to allow computers to learn from experience and understand the world in terms of a hierarchy of concepts, with each concept defined in terms of its relation to simpler concepts. By gathering knowledge from experience, this approach avoids the need for human operators to formally specify all of the knowledge that the computer needs.”
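The book's central idea, learning from experience rather than hand-coded rules, can be sketched with a toy example that is not from the book: a perceptron that learns the logical AND function from labeled data alone. No human writes the rule; the weights are adjusted from experience.

```python
# Training data: ((input pair), expected output) for logical AND.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

w = [0.0, 0.0]   # weights, learned from the data
b = 0.0          # bias
lr = 0.1         # learning rate

def predict(x):
    # Fire (output 1) when the weighted sum clears the threshold.
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

# Repeatedly nudge the weights wherever a prediction is wrong.
for _ in range(20):
    for x, target in data:
        error = target - predict(x)
        w[0] += lr * error * x[0]
        w[1] += lr * error * x[1]
        b += lr * error

print([predict(x) for x, _ in data])  # prints [0, 0, 0, 1]
```

Deep learning stacks many such learned units into a hierarchy, so that complex concepts are built from simpler learned ones, but the experience-driven adjustment is the same in spirit.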
If you have time take a detour and read the book, or if you want to save time there is always Wikipedia.
June 15, 2016
The Dark Web and the deep web are often misidentified and confused by readers. To take a step back, TransUnion’s blog offers a brief read called “The Dark Web & Your Data: Facts to Know” that helpfully addresses some basic information on these topics. First, a definition of the Dark Web: sites accessible only when a computer’s unique IP address is hidden on multiple levels. Specific software is needed to access the Dark Web because that software encrypts and conceals the machine’s IP address. The article continues,
“Certain software programs allow the IP address to be hidden, which provides anonymity as to where, or by whom, the site is hosted. The anonymous nature of the dark web makes it a haven for online criminals selling illegal products and services, as well as a marketplace for stolen data. The dark web is often confused with the “deep web,” the latter of which makes up about 90 percent of the Internet. The deep web consists of sites not reachable by standard search engines, including encrypted networks or password-protected sites like email accounts. The dark web also exists within this space and accounts for approximately less than 1 percent of web content.”
For those not reading news about the Dark Web every day, this seems like a fine piece for brushing up on cybersecurity concerns relevant at the individual user level. TransUnion is wise to educate its clients, as banks are an evergreen target for cybercrime and security breaches. It seems the message of this posting to clients can be interpreted as one of the “good luck” variety.
Megan Feil, June 15, 2016
May 23, 2016
Who exactly are today’s innovators? The Information Technology & Innovation Foundation (ITIF) performed a survey to find out, and shares a summary of their results in, “The Demographics of Innovation in the United States.” The write-up sets the context before getting into the findings:
“Behind every technological innovation is an individual or a team of individuals responsible for the hard scientific or engineering work. And behind each of them is an education and a set of experiences that impart the requisite knowledge, expertise, and opportunity. These scientists and engineers drive technological progress by creating innovative new products and services that raise incomes and improve quality of life for everyone….
“This study surveys people who are responsible for some of the most important innovations in America. These include people who have won national awards for their inventions, people who have filed for international, triadic patents for their innovative ideas in three technology areas (information technology, life sciences, and materials sciences), and innovators who have filed triadic patents for large advanced-technology companies. In total, 6,418 innovators were contacted for this report, and 923 provided viable responses. This diverse, yet focused sampling approach enables a broad, yet nuanced examination of individuals driving innovation in the United States.”
See the summary for results, including a helpful graphic. Here are some highlights: Unsurprisingly to anyone who has been paying attention, women and U.S.-born minorities are woefully underrepresented. Many of those surveyed are immigrants. The majority of survey-takers have at least one advanced degree (many from MIT), and nearly all majored in STEM subjects as undergrads. Large companies contribute more than small businesses do, and innovations are clustered in California, the Northeast, and close to sources of public research funding. And take heart, anyone over 30: despite the popular image of 20-somethings reinventing the world, the median age of those surveyed is 47.
The piece concludes with some recommendations: We should encourage both women and minorities to study STEM subjects from elementary school on, especially in disadvantaged neighborhoods. We should also lend more support to talented immigrants who wish to stay in the U.S. after they attend college here. The researchers conclude that, with targeted action from the government on education, funding, technology transfer, and immigration policy, our nation can tap into a much wider pool of innovation.
Cynthia Murrell, May 23, 2016
March 23, 2016
Drug dealing is a shady business that takes place in a nefarious underground and runs discreetly under our noses. Along with drug dealing comes a variety of violence involving guns, criminal offenses, and often death. Countless people have lost their lives related to drug dealing, and that does not even include the people who overdosed. Would you believe that the drug dealing violence is being curbed by the Dark Web? TechDirt reveals, “How The Dark Net Is Making Drug Purchases Safer By Eliminating Associated Violence And Improving Quality.”
The Dark Web is the Internet’s underbelly, where stolen information and sex trafficking victims are sold, terrorists mingle, and, of course, drugs are peddled. Who would have thought that the Dark Web would actually provide a beneficial service to society by sending drug dealers online and taking them off the streets? With the drug dealers goes the associated violence. There also appears to be a system of checks and balances, where drug users can leave feedback a la eBay. It pushes the drug quality up as well, but is that a good or bad thing?
“The new report comes from the European Monitoring Centre for Drugs and Drug Addiction, which is funded by the European Union, and, as usual, is accompanied by an official comment from the relevant EU commissioner. Unfortunately, Dimitris Avramopoulos, the European Commissioner for Migration, Home Affairs and Citizenship, trots out the usual unthinking reaction to drug sales that has made the long-running and totally futile “war on drugs” one of the most destructive and counterproductive policies ever devised:
‘We should stop the abuse of the Internet by those wanting to turn it into a drug market. Technology is offering fresh opportunities for law enforcement to tackle online drug markets and reduce threats to public health. Let us seize these opportunities to attack the problem head-on and reduce drug supply online.’”
The war on drugs is a futile fight, but illegal substances do not benefit anyone. While it is a boon to society for the crime to be taken off the streets, take into consideration that the Dark Web is also a breeding ground for crimes arguably worse than drug dealing.
December 8, 2015
Museums are the cultural epicenters of the human race, because they house the highest achievements of art, science, history, and more. The best museums in the world are located in the most populous cities, and they house prized works of art that represent the best of what humanity has to offer. The only problem with these museums is that they sit in one fixed location, and unless you have the luck to travel, you cannot see these fabulous works in person.
While books have often served as the gateway to museums’ collections, they are not the same as seeing an object or exhibit in real life. The Internet, with continuously evolving photographic and video technology, has replicated museums’ collections as lifelike as possible without requiring you to leave home. The only problem is that these digital collections are limited to what is within a museum’s archives. But what would happen if an organization collected all these artifacts in one place, like a social networking Web site?
Google has done something extraordinary by creating the Google Cultural Institute. The Google Cultural Institute is part digital archive, part museum, part Pinterest, and part encyclopedia. It is described as:
“Discover exhibits and collections from museums and archives all around the world. Explore cultural treasures in extraordinary detail, from hidden gems to masterpieces.”
Users can browse collections of art, history, and science ranging from classical works to street art to the Holocaust and World War I. The Google Cultural Institute presents information via slideshows with captions. Collections are divided by subject and content as well as by the museum where they originate. Using Google Street View, users can also view the very places where the collections are stored. Users can also make their own collections and share them, as on Pinterest.
This is an amazing step toward the next stage of museums’ evolution, as well as a way for people who might never have the chance to visit them to see the collections. The only recommendation is that it would be nice if Google put more advertising into the Google Cultural Institute so that people actually know it exists.
December 2, 2015
I know I write about search and content processing. But no one motivates me more to think about finding “everything” about a matter than the legal eagles. Imagine my surprise when I read “Law School Grads Are Bombing the Bar, and It’s a Sign of Trouble for Legal Education.” LexisNexis and Westlaw, among other vendors, may have to do more. What about the folks facing their student loan payments? What can they do? According to the write up:
In California, for example, passage rates for the exam in July hit the lowest point since 1986, with just 46.6% of total applicants and 60% of first-time test takers passing. In New York, the passage rate for first-time test takers dropped 4 percentage points since the July 2014 test, from 74% to 70%, hitting the lowest point since 2004. Passage rates also dipped in Washington, DC, Florida, Georgia, New Jersey, and Pennsylvania.
What do these featherless legal eagles do? Why not become search and content marketing professionals? Some failed webmasters, unemployed middle school teachers, and terminated middle managers have found little jet packs to lift them into economic prosperity.
Stephen E Arnold, December 2, 2015
November 5, 2015
I heard an interesting idea the other day. Most parents think that when their toddler can figure out how to use a tablet, he or she is a genius, but did you ever consider that the real genius is the person who actually designed the tablet’s interface? Soon a software developer will be able to think his or her newest cognitive system is the next Einstein or Edison, says Computerworld in the article “Machines Will Learn Just Like A Child, Says IBM CEO.”
IBM CEO Virginia Rometty said that technology has reached the point where machines are close to reasoning. Current cognitive systems are capable of understanding unstructured data, such as images, videos, songs, and more.
” ‘When I say reason it’s like you and I, if there is an issue or question, they take in all the information that they know, they stack up a set of hypotheses, they run it against all that data to decide, what do I have the most confidence in, ‘ Rometty said. The machine ‘can prove why I do or don’t believe something, and if I have high confidence in an answer, I can show you the ranking of what my answers are and then I learn.’ ”
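Rometty's description, stacking up hypotheses, running them against the data, and reporting a ranked set of answers by confidence, can be sketched as a simple likelihood ranking. The hypotheses and probabilities below are invented for illustration; this is not IBM's actual method.

```python
# Toy version of "stack up hypotheses, run them against the data, rank by
# confidence": score each hypothesis by how well it explains the evidence.
# P(evidence | hypothesis) for three observed pieces of evidence (invented).
likelihoods = {
    "server outage":  [0.9, 0.8, 0.7],
    "network attack": [0.6, 0.3, 0.2],
    "user error":     [0.2, 0.1, 0.4],
}

def confidence(probs):
    """Joint likelihood of all the evidence under one hypothesis."""
    score = 1.0
    for p in probs:
        score *= p
    return score

# Rank hypotheses from most to least confident.
ranking = sorted(likelihoods, key=lambda h: confidence(likelihoods[h]), reverse=True)
print(ranking[0])  # prints: server outage
```

The part Rometty emphasizes, showing the ranking of answers rather than a single verdict, falls out naturally: the sorted list itself is the system's explanation of what it believes and how strongly.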
Cognitive systems learn more as they are fed more data, and there is growing demand for machines that can process more data, are “smarter,” and handle the routine tasks that make them useful.
The best news about machines gaining the learning capabilities of a human child is that they will not replace actual human beings, but rather augment our knowledge and current technology.
Whitney Grace, November 5, 2015
September 11, 2015
Academic publishers, such as Springer and Elsevier, have a monopoly on academic publishing, and they do not want to lose their grasp. In the Slashdot science forum, a report from The Guardian, “Paywalled Science Journals Under Fire Again,” was posted describing how the academic publishers won a battle in Australia.
The Medical Journal of Australia (MJA) fired its editor, Professor Stephen Leeder, when he expressed his displeasure over the journal outsourcing its functions to Elsevier. Leeder may have lost his job, but he will speak at a symposium at the State Library of NSW about ways academic communities can fight the commoditization of knowledge.
What is concerning is that academic publishers are more interested in turning a profit than expanding humanity’s knowledge base:
“Alex Holcombe, an associate professor of psychology who will also be presenting at the symposium, said the business model of some of the major academic publishers was more profitable than owning a gold mine. Some of the 1,600 titles published by Elsevier charged institutions more than $19,000 for an annual subscription to just one journal. The Springer group, which publishes more than 2,000 titles, charges more than $21,000 for access to some of its titles. ‘The mining giant Rio Tinto has a profit margin of about 23%,’ Holcombe said. ‘Elsevier consistently comes in at around 37%. Open access publishing is catching on, but it requires researchers to pay up to $3000 to get a single open access article published.’”
Where does the pursuit of knowledge actually take place if researchers are at the mercy of academic publishers? One might say that researchers could publish their work for free on the Web, but remember that anyone can do that. Being published under a reputable banner adds to a study’s authenticity and also helps it get used to support other research. The problem lies in the fact that big academic publishers limit access to their content to subscription holders, and often those subscriptions are too expensive for the average researcher to afford. Researchers want access to more academic content, but it is being locked down.
September 7, 2015
We cannot resist sharing this article with you, though it is only tangentially related to search; perhaps it has implications for the field of eDiscovery. Bloomberg Business asks and answers: “Are Lawyers Getting Dumber? Yes, Says the Woman who Runs the Bar Exam.”
Apparently, scores from the 2014 bar exam dropped significantly across the country compared to those of the previous year. Officials at the National Conference of Bar Examiners (NCBE), which administers the test, insist they carefully checked their procedures and found no problems on their end. They insist the fault lies squarely with that year’s crop of law school graduates, not with testing methods. Erica Moeser, head of the NCBE, penned a letter to law school officials informing them of the poor results and advising that they take steps to improve their students’ outcomes. To put it mildly, this did not go over well with law school administrators, who point out that Moeser herself never passed the bar because she practices in Wisconsin, the only state in which the exam is not required to practice law.
So, who is right? Writer Natalie Kitroeff points out this salient information:
“Whether or not the profession is in crisis—a perennial lament—there’s no question that American legal education is in the midst of an unprecedented slump. In 2015 fewer people applied to law school than at any point in the last 30 years. Law schools are seeing enrollments plummet and have tried to keep their campuses alive by admitting students with worse credentials. That may force some law firms and consumers to rely on lawyers of a lower caliber, industry watchers say, but the fight will ultimately be most painful for the middling students, who are promised a shot at a legal career but in reality face long odds of becoming lawyers.”
The 2015 bar exam results could provide some clarification, but those won’t start coming out until sometime in September. See the article for much more information on Moeser, the NCBE, the bar exam itself, and the state of legal education today. Makers of eDiscovery software may want to beef up their idiot-proofing measures as much as possible, just to be safe.
Cynthia Murrell, September 7, 2015