Understanding Data Repositories
A data repository can be many things. Try to find a consistent definition online and you’ll end up with a new meaning for every site you visit. The fact of the matter is that the term data repository means something slightly different to everyone who uses it. However, most people will agree that a data repository is a centralized storage location for information that an organization maintains as part of its knowledge base. Using data mining techniques, the organization can probe this knowledge base and actually create new knowledge from it. Of course, there are many other implications of data repositories, but the bottom line is that you’re talking about a lot of data in most cases, some of it actively maintained and some of it stored for historical reasons. Keeping data of this sort safe is a big job.
Data repositories abound. You can find a host of open data repositories on sites such as Open Access Directory (OAD) (http://oad.simmons.edu/oadwiki/Data_repositories), as shown in Figure 3-2. You might actually use some of these repositories in your application. So, security isn’t simply a matter of keeping your private repository safe; you must also ensure that any public repository you use is safe. After all, a hacker doesn’t really care too much about the means used to access your application and ultimately your organization. All that matters is that the access happens.
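One way to act on this advice is to treat anything that arrives from a public repository as untrusted input. The following is only a minimal sketch of that idea, not a complete defense. The host name data.example.org, the name and count fields, and both function names are hypothetical and chosen purely for illustration; a real application would substitute the repositories and record layout it actually uses.

```python
from urllib.parse import urlparse

# Hypothetical allowlist of repository hosts the application trusts.
TRUSTED_HOSTS = {"data.example.org"}


def is_trusted_source(url: str) -> bool:
    """Accept only HTTPS URLs whose host is on the allowlist."""
    parts = urlparse(url)
    return parts.scheme == "https" and parts.hostname in TRUSTED_HOSTS


def validate_record(record: object) -> dict:
    """Return a cleaned copy of a record, or raise ValueError if it is malformed.

    The expected fields (name, count) are assumptions made for this sketch.
    """
    if not isinstance(record, dict):
        raise ValueError("record must be a JSON object")
    name = record.get("name")
    count = record.get("count")
    if not isinstance(name, str) or not (0 < len(name) <= 100):
        raise ValueError("name must be a string of 1 to 100 characters")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(count, bool) or not isinstance(count, int) or count < 0:
        raise ValueError("count must be a non-negative integer")
    # Copy only the expected fields so unexpected keys never reach the application.
    return {"name": name, "count": count}
```

The application would call is_trusted_source() before requesting data and pass every record it receives through validate_record() before using it. Copying only the expected fields, rather than passing the original object along, keeps anything unexpected in the repository data from traveling further into the application.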