Monday, December 15, 2008

First Cut on Knowledge Management

A Brief Treatise on Knowledge Management
by Chris Manning
copyright Manning Applied Technology, all rights reserved
www.appl-tech.com
14 December 2008

Introduction

This white paper addresses knowledge management. The subject area is multifaceted and also includes or touches on artificial intelligence, project management, information management, and collaboration tools. This paper is cast in terms of engineering, but the concepts will be useful in almost all fields of endeavor. Critiques of this document are actively solicited from interested parties. Some reviewers will have useful insights on where technology is going in this subject area and may be able to point out relevant new developments.

Humans have only scratched the surface of how to use computers efficiently. The terminus of that trajectory is artificial intelligence, the point at which the systems that humans have designed continue to optimize and extend themselves. "Real men write self-modifying code" is how one prescient programmer put it. Very few organizations do knowledge creation, management, use and reuse as well as Toyota. Their development of personnel is equally noteworthy.

The instant project in particular and collaborative projects in general will benefit greatly from effective knowledge management. Any approach undertaken at present requires discipline in two key areas: first, the knowledge must be entered into a data system in a useful format, and second, users must access and employ the knowledge base appropriately. Both of these tasks hinge on user interfaces and training. If it were easy, everyone would do it. Engineers are intrinsically adept at doing design work, discussing design work, and recounting project history, but perhaps are not adept at entering data in computer-readable formats. The modest fraction of engineers and scientists who write well tend to become managers.

Purposes

The advantages of effective information management in an engineering project can be summarized succinctly, roughly in order of importance:

1. efficient design execution; avoiding duplication of effort
2. management of subsequent manufacturing processes
3. troubleshooting of fielded systems
4. extension of system capabilities (i.e., subsequent reengineering)
5. knowledge and design reuse
6. estimation of return on investment (i.e., estimating overall project cost)
7. estimation of related project costs (similar future efforts)
8. failure analysis, forensics

Countless related areas can be identified. These include early identification of critical problems, sharing vendor/part information and design memes between projects, and avoiding duplication of effort. In a sense, there are two types of duplication of effort, both of which can be reduced by feedback: positive feedback in the form of sharing successful design elements, and negative feedback in the form of sharing problems to be avoided.

To the extent that project files, knowledge and information comprise company-proprietary information, access must be closely guarded. Further, to the extent that these files contain critical information that would be very expensive to replace, they must be backed up at independent locations. Access control is a straightforward function, as is backup. Both are routinely practiced in corporate information technology (IT) departments. However, for collaborative projects, access must be extended to subcontractors. Corporate IT policy must be satisfied by any vehicles used, and by their method of use.

Traditional solutions

Knowledge and information traditionally have been managed by writing reports and compiling build data repositories to hold assembly manuals, schematics, board layouts, and software files. These approaches are an acceptable baseline and enforce the capture discipline, to the extent that knowledge and information can be and are embedded successfully in these vehicles. A widespread and useful approach to managing data in these formats is the use of version control software. The two most common examples are Subversion from the open-source community and SourceSafe from Microsoft.

Improvements to the traditional approach

Thus, some aspects of improved knowledge/information management are straightforward. Simply putting project files in a common data structure and providing search tools (e.g., Google desktop tools) is a powerful step in the right direction. Each of the purposes described above may require navigating different paths through the data set. Ideally, each of the navigation paths could be customized and captured in a suitable user interface. This aspect of the subject area could be summarized as information management.
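As a rough illustration of the "common data structure plus search tools" idea, the sketch below builds a keyword index over a folder of project files and returns the files that contain every word in a query. It is not how Google's desktop tools work internally; the function names, the choice of file extensions, and the plain-text assumption are all invented for this example.

```python
import re
from collections import defaultdict
from pathlib import Path


def build_index(root, extensions=(".txt", ".md", ".csv")):
    """Map each lowercase word to the set of files under root that contain it.

    Only plain-text files with the listed extensions are read; the extension
    list is an assumption made for this sketch.
    """
    index = defaultdict(set)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in extensions:
            text = path.read_text(encoding="utf-8", errors="ignore")
            for word in re.findall(r"[a-z0-9]+", text.lower()):
                index[word].add(path)
    return index


def search(index, query):
    """Return the files that contain every word in the query, sorted by path."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    # index.get avoids creating empty entries for words that were never seen.
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)
```

A call such as search(build_index("project_files"), "33 ohm") would list every indexed file mentioning both terms, where "project_files" is a placeholder folder name. Separate navigation paths for manufacturing, troubleshooting, or cost estimation could then be layered on top of the same index.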

A further improvement to knowledge management that can be implemented now is recording conferences and meetings, then putting the resulting files in the same data structure with other project files. To be useful, these files must be readily searchable. Recently, Matt Williamson, one of MAT's contractors who is very knowledgeable on tech gadgets, demonstrated Google's new voice recognition features on an iPhone. Not only is Google already using speaker-independent speech-to-text capability for internet searching, but it is also beginning to use speech-to-text tools to make the archive of YouTube videos readily (text) searchable without the need for manual keyword entry. Audio capture with speech-to-text conversion will be a very powerful tool for capturing knowledge from humans, who are very comfortable discussing projects, history and why things were done in a particular way. In the near future, it will be feasible to extract rules embodying the knowledge from the captured speech. This will be another important step on the path to artificial intelligence (AI). The text-searching capability will make it convenient to find segments by audio pattern matching (e.g., "please find the segment where Mike asked about 33 ohm resistors on the telecon last month"). Google's roadmap to the future started with text pattern matching and now includes audio pattern matching. It can be anticipated that Google's business plan includes picture pattern matching, then video pattern matching and ultimately artificial intelligence. A key aspect of AI is pattern recognition and matching.
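Once a recording has been converted to text, finding "the segment where Mike asked about 33 ohm resistors" reduces to searching timestamped transcript segments. The sketch below assumes a speech-to-text step has already produced such segments with speaker labels; the Segment record, its field names, and the sample data are invented for illustration and do not come from any real recording or tool.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    meeting: str        # hypothetical meeting label
    start_seconds: int  # offset into the recording
    speaker: str
    text: str           # speech-to-text output for this stretch of audio


def find_segments(segments, terms, speaker=None):
    """Return segments whose text contains every term, optionally by speaker."""
    wanted = [term.lower() for term in terms]
    results = []
    for seg in segments:
        if speaker is not None and seg.speaker.lower() != speaker.lower():
            continue
        text = seg.text.lower()
        if all(term in text for term in wanted):
            results.append(seg)
    return results


def timestamp(seconds):
    """Format an offset in seconds as H:MM:SS for jumping into the audio."""
    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Invented sample transcript, standing in for real speech-to-text output.
SAMPLE = [
    Segment("telecon", 120, "Anna", "The board layout is ready for review."),
    Segment("telecon", 754, "Mike", "Why are there 33 ohm resistors on the clock line?"),
    Segment("telecon", 790, "Anna", "The 33 ohm resistors damp ringing on that line."),
]

for seg in find_segments(SAMPLE, ["33 ohm", "resistors"], speaker="Mike"):
    print(seg.meeting, timestamp(seg.start_seconds), seg.text)
```

Simple substring matching is used here only to keep the sketch short. Extracting rules from the captured speech, as anticipated above, would need far more than this.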

One state-of-the-art practice at MAT is scanning scientific papers and documents using a Canon MP-780 all-in-one fax/copier/printer/scanner. The software provided with the MP-780 generates PDF files having embedded text generated via optical character recognition (OCR). The resulting files are text-searchable, using either the built-in Windows file search capability (slow) or Google desktop tools (fast). The files and Google search can readily be served over a network. This approach is very powerful and clearly useful for project information management, particularly when dealing with legacy paper. Packing lists are routinely scanned. In the past, MAT has written many comprehensive project reports, which already are computer files. The use of Google tools makes the information readily available via keyword search. MAT also has several internal wiki pages. One is used for the routine task of tracking lunch orders. Another is used for experimenting with knowledge capture.

Commercial and open-source implementations

Collaboration tools can be divided, though not very cleanly, into two areas: on-line (real-time) and off-line. Some packages may include both types. On-line tools include videoconferencing and desktop sharing, which allow participants at different locations to access and share any information that can be displayed on a computer screen or in front of a video camera. The off-line aspect of collaboration tools is routine information sharing, which includes conference and meeting scheduling, tracking of project information and milestones, and access to project files. It is quite likely that fuel prices will move to record highs after the current worldwide recession. Combined with dropping bandwidth costs, this will drive significantly more business to videoconferencing.

A number of real-time collaboration tools have been on the market for several years to a decade. One of the earliest was Microsoft NetMeeting, which supported desktop videoconferencing with screen sharing at least as early as 1998. Desktop videoconference tools have become much more popular in recent years. WebEx was acquired by Cisco in 2007 after achieving roughly two-thirds market share. Competitors in this space include Infinite Conferencing, Vidyo, Hewlett-Packard and many others. Simple and free tools include the various instant messenger offerings, such as MSN Messenger, which support video, audio and text, but not screen sharing. It seems reasonable to guess that real-time tools will be integrated as part of the overall collaboration toolset, which will facilitate data capture and archiving of conference sessions. Gmail chat already allows archiving of sessions.

A wiki is a system of essentially blank web pages with a simple editor that allows users to create content and hyperlinks between pages. It satisfies many of the criteria described above, including the potential to embed knowledge and flexible navigation. The pages can include links to drawings, schematics, and any other files that are part of a design process. For each different type of navigation required, it is possible to set up a separate wiki page, or set of wiki pages, that provides the navigation. Confluence, a wiki-based tool, is one of the simplest solutions to the problem of knowledge capture and information access. Confluence is offered by Atlassian (San Francisco, CA) and can be installed locally or run on their servers.

Another tool that combines information, project and knowledge management is product lifecycle management (PLM) software. Generally, PLM products also are advertised as collaboration tools. In a sense, this is a tautology, because almost all product development is collaborative. Examples include Teamcenter (Siemens), CATIA (Dassault Systemes), Windchill (Parametric Technology Corporation) and numerous others. Teamcenter began as a joint-venture enhancement to computer drafting/CAD tools and was eventually acquired by Siemens. Teamcenter is used by several major automobile manufacturers, while CATIA is used by Boeing and Ford Motor, among others. The scale and expense of these enterprise-scale solutions may be inappropriate for small projects and small companies. Clearly this space is highly competitive, so it can be anticipated that these tools will continue to mature. These packages include a suite of tools addressing most, if not all, aspects of project management.

Redmine (www.redmine.org) is a rapidly maturing free and open-source software (FOSS) project that combines wikis, roadmaps, bug tracking, issue tracking, feature requests, forums, documents and other files with user data stored in a source control management tool, such as Subversion or Git. It is slightly programmer oriented, but any small- to medium-sized company can benefit from having all of these features packaged into a single service. From an SBIR proposal point of view, a project planning tool that can capture the essence of a project would be very helpful for writing proposals.

Arguably, knowledge management is a database problem in which the knowledge elements are catalogued. The critical component that goes beyond simple database technology is expressing the relationships between the elements. The complexity of the connections is the essence of intelligence.
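The database view can be made concrete with a small sketch: one table catalogues the knowledge elements, and a second table records labeled relationships between them. This is only one possible schema, not a description of any of the products discussed above. The table names, column names, helper functions, and relationship labels are all invented for the example, which uses the SQLite module from the Python standard library.

```python
import sqlite3


def open_store():
    """Create an in-memory store of knowledge elements and labeled relations."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE element (
            id   INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL,
            kind TEXT NOT NULL
        );
        CREATE TABLE relation (
            source INTEGER NOT NULL REFERENCES element(id),
            target INTEGER NOT NULL REFERENCES element(id),
            label  TEXT NOT NULL
        );
    """)
    return db


def add_element(db, name, kind):
    """Catalogue one knowledge element and return its row id."""
    cur = db.execute(
        "INSERT INTO element (name, kind) VALUES (?, ?)", (name, kind)
    )
    return cur.lastrowid


def relate(db, source_id, label, target_id):
    """Record a labeled, directed relationship between two elements."""
    db.execute(
        "INSERT INTO relation (source, target, label) VALUES (?, ?, ?)",
        (source_id, target_id, label),
    )


def related(db, name):
    """List (label, name, kind) for everything the named element points to."""
    rows = db.execute(
        """
        SELECT r.label, t.name, t.kind
        FROM element s
        JOIN relation r ON r.source = s.id
        JOIN element t ON t.id = r.target
        WHERE s.name = ?
        ORDER BY r.label, t.name
        """,
        (name,),
    )
    return rows.fetchall()
```

For instance, a part could be linked to the problem it mitigates and to the schematic that documents it, and related() would then answer "what is connected to this part, and how?" The element table alone is the simple catalogue; the relation table is where the connections, and hence most of the value, reside.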

Conclusions

Significant value can be realized by effective use of knowledge management tools. At present, no clear winner has emerged, particularly for small projects. The market for these tools is not yet mature. One approach to the instant project is to put together inexpensive, off-the-shelf open-source tools, such as wikis and Subversion, or use the Redmine open-source toolset. Combined with Google desktop tools, this approach is very inexpensive and powerful. Clearly, all companies will have to choose standardized toolsets in the near future. Because the investment of project effort scales with the number of projects, larger companies must choose sooner and better.

Acknowledgements
Stefan Natchev for encouragement and for providing information on Redmine
Tom Old and Bob Hertel for providing input on Teamcenter
Sami Nuwayser for general critique and lots of good ideas