Tuesday, 19 March 2019

Points for Evaluators to Consider When Destroying Data

Evaluators' ability to obtain program and/or client data is based on trust.  Not only are evaluators expected to interpret the data properly, they are also required to manage client privacy responsibly.  Every so often, there are stories in the media about data leaking into the wrong hands.  No evaluator wants to be one of those stories.

One key aspect of inspiring confidence in the use of the data is to be accountable for what happens to it.  This means that at the end of the study, the data should be destroyed.  An initial reaction is that the evaluator just has to delete the data files.  But is it that simple?

There are several points to consider, and although it is not complete, this article gives the general flavour.  First, what actually happens when a file is deleted?  No more than the obvious: the file no longer appears in the list of files for the directory.  A skillful hacker who can get around the computer's operating system will find the data on the hard disk untouched.  Technically, the data remains on the disk until another file is written over it.

The solution to this will depend on the version of the operating system that you have installed.  All mature operating systems will have some variant on the traditional delete function that overwrites the existing data so that it no longer exists.  Still, this is not enough for some.  Although I know of no actual cases in program evaluation, it is considered a real possibility that some of the old data will be missed during the overwrite and will still be present on the disk.  Commercial products exist to perform a very thorough wipe of a disk to eliminate that possibility.
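As a rough illustration of the overwrite-then-delete idea, here is a minimal Python sketch.  The function name `overwrite_and_delete` and its parameters are my own, not part of any standard tool.  It also assumes the file system rewrites the file in place, which is not guaranteed on solid state drives, journaling or copy-on-write file systems, or where backups and snapshots exist, so treat it as a demonstration of the principle rather than a certified wipe.

```python
import os


def overwrite_and_delete(path, passes=1, chunk_size=1024 * 1024):
    """Overwrite a file's bytes with random data, then remove it.

    Hypothetical helper for illustration only. It assumes the storage
    layer rewrites the same physical blocks, which may not hold on SSDs
    or copy-on-write file systems.
    """
    size = os.path.getsize(path)
    # "r+b" opens the existing file for in-place binary writing
    # without truncating it first.
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))
                remaining -= n
            # Push the new bytes out of Python's and the OS's buffers.
            f.flush()
            os.fsync(f.fileno())
    # Only now remove the directory entry.
    os.remove(path)
```

A call such as `overwrite_and_delete("client_data.csv")` (a made-up file name) would replace the contents with random bytes before the ordinary delete takes place.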

It is not just the operating system that may leave stray copies of the data.  The statistical software may also create data in intermediate forms, sometimes in unusual formats.  R is a good example of this, as analysts are encouraged to collate the data in refined file formats, which may be more revealing of the personal data than the original.
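One way to hunt for such stray copies is to walk the project folder and list files whose names suggest saved data.  The sketch below is only a starting point: the function name and the list of extensions are assumptions of mine (R workspace and history files, plus a few common data formats), and the list should be adjusted to the software actually used.

```python
import os

# Assumed list of suspect names and extensions; adjust to your own tools.
SUSPECT_EXTENSIONS = {
    ".rdata", ".rda", ".rds", ".rhistory",  # R workspace, objects, history
    ".csv", ".sav", ".dta",                 # common data exchange formats
}


def find_stray_files(root, extensions=SUSPECT_EXTENSIONS):
    """Return a sorted list of paths under root that look like data files."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            # Names such as ".RData" have no extension according to
            # os.path.splitext, so compare the whole name as well.
            if ext in extensions or name.lower() in extensions:
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Running `find_stray_files` on the study folder gives a checklist of files to review before declaring the data destroyed.  It will not catch temporary files the software writes elsewhere on the disk.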

There is another potential danger that is very real in the world of program evaluation, where reproducibility of results is an issue.  Evaluators must be able to demonstrate how results were achieved and that they are valid.  There is no way to do this without keeping the data for a certain period, as specified in the privacy agreements.  However, for this to work, the evaluator's organization must have an effective form of archive management in place, so that it can be assured that the data is deleted when that period ends, even if the evaluator has left the job.

What is the moral of the story?  Before you request data, make sure you have a plan in place.  Data management is not as simple as it looks.

Hard Drive being physically destroyed  



Wednesday, 30 January 2019

A Day in the Life of an Internet Site Manager

As part of my consulting practice, I have a small Internet site, HenskyConsulting.com.  This is almost normal for a small independent consultant.  In general, once the site is set up, there is very little work in maintaining it.  Unfortunately, from time to time, you have to deal with issues, and this blog is about one of those issues.

When the Internet was first conceived, it was built on a naive sense of trust.  The idea was that the people who were smart enough to use the Internet were also ethical enough to use it for higher purposes.  As a result, there was a great explosion of knowledge.  The Internet also became easier to use and there was a great explosion of users of all kinds.

Could this virtual Woodstock last forever?  Not surprisingly, many forms of malicious behaviour have arisen.  As a result, the managers of Internet sites must attempt to keep their sites safe.  This involves all kinds of activities, some of which you would not want to document on a blog, but some of which are worth sharing in a short post.  For me, the activity of the day is email spoofing.

"Email spoofing" is a specific activity by which a nasty individual can send out an email that appears to come from you.  They do this by creating a virtual email server that emulates the real server to the point where the From fields appear identical to those of a real email.  If one were to receive such an email on a smartphone, it would be nearly impossible to know that it is a fake.  If the email is viewed on a desktop computer, there are telltale signs that the source is fraudulent.

What does this mean for the site manager who has created the email account?  Basically, you are on your own if the individual is based abroad.  Your local police authorities are not equipped to deal with this.  You should report it to ICANN, but that will not help in the short run.  The ISP that hosts your web service is the first point of contact.  Essentially, when you request an email address to be created, you need to understand the various parameters.

Why would creating an email address be complicated?  Basically, when the Internet was created, a lot of flexibility was granted out of that naive sense of trust.  Generally, the default email accounts allow for a high degree of flexibility, as no one wants to lose time when configuring their email setup.  This is only possible if the security is relaxed.

Relaxed security in creating an email account means that it is easy to connect multiple devices to your account, such as smartphones and laptops, as well as your main work computer.  The downside is that it becomes relatively easy for someone to pretend to be one of your multiple devices.  The prime way to tighten up your security is to go to your ISP and follow the best practices that they have listed.

In the case of email spoofing, you can look at the email "source" with your email software.  You will likely see a trail that is very different from what you would have seen with a real email sent from one of your devices.  If you check the IP addresses in the trail, they will likely not match any address associated with your own devices or provider.  The mail server will also likely not be one used by your Internet provider, but a generic one running software such as Exim (ironically, developed at the University of Cambridge).  If you tighten up the SPF record, which governs which servers may send mail on behalf of your domain, this should solve the problem.  If not, you will have to bite the bullet and shut down the email address.
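For those who prefer not to read the raw source by eye, the same checks can be sketched in a few lines of Python using the standard `email` module.  Everything in the sample below is made up for illustration: the domains are example names, the IP address comes from a range reserved for documentation, and the function names are my own.  The last helper reads the `all` term of an SPF record, where `-all` is the strict setting and `~all` or `?all` are the relaxed ones.  What you will actually see depends on your provider, so take this as a sketch rather than a diagnostic tool.

```python
from email import message_from_string

# Hypothetical message source, not taken from a real email.
SAMPLE = (
    "Received: from mail.example.net (mail.example.net [203.0.113.7])\n"
    "\tby mx.example.org with ESMTP; Wed, 30 Jan 2019 10:00:00 +0000\n"
    "Received-SPF: fail (example.org: domain of user@example.com does not "
    "designate 203.0.113.7 as permitted sender)\n"
    "From: user@example.com\n"
    "Subject: test\n"
    "\n"
    "body\n"
)


def header_values(raw_message, name):
    """Return every value of the named header, with whitespace collapsed."""
    msg = message_from_string(raw_message)
    return [" ".join(value.split()) for value in msg.get_all(name, [])]


def received_trail(raw_message):
    """The chain of servers that handled the message, newest first."""
    return header_values(raw_message, "Received")


def spf_results(raw_message):
    """SPF verdicts recorded by receiving servers, if any were added."""
    return header_values(raw_message, "Received-SPF")


def spf_all_qualifier(record):
    """Return the qualifier on the 'all' term of an SPF record, or None."""
    for term in record.split():
        t = term.lower()
        if t.endswith("all") and t[:-3] in ("", "+", "-", "~", "?"):
            # A bare "all" has the default qualifier "+".
            return t[:-3] or "+"
    return None
```

Calling `received_trail(SAMPLE)` lists the servers in the trail, `spf_results(SAMPLE)` shows whether a receiving server flagged the sender, and `spf_all_qualifier("v=spf1 include:_spf.example.com ~all")` reports how strictly a (hypothetical) record is set.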

If this sounds frustrating, it is.  I am dealing with this issue right now.  If it is not just a bluff, there may well be embarrassing emails going out under my name.  Still, it is the price we pay for our freedom on the Internet.  Do not be surprised if this note is updated as I learn more in the near future.

A portion of the email source trailer.  Note that I have replaced my personal information with XXX.  The IP address is that of the spoofer.