Ethics of using company data from internship for academic project

I am a graduate student and one of my research projects requires me to crawl a website using their public API. This is going alright but the amount of data we can collect is limited, since it is a public API. As it happens, I will be doing an internship at the company that owns that website, and will soon have access to an unlimited supply of their data. Is it okay if I use that data for my academic research project, or will they expect me to only use their data for projects relating to my internship?

asked Feb 22, 2015 at 19:06

You will almost certainly need the company's permission. I'd expect you will have to sign an NDA. It may also be more difficult to publish your results if you can't share your data with reviewers.

Commented Feb 22, 2015 at 19:11

"will they expect me to only use their data for projects relating to my internship?" - Why don't you simply ask them?

Commented Feb 22, 2015 at 19:12

This is not about ethics, really, but about law.

Commented Feb 23, 2015 at 7:28

The public data is obviously ok to use for academic research, but do not expose private client data through research publications, regardless of whether or not an NDA is in place. I guarantee that you would be persona non grata in your industry if you do that. You would be de facto blackballed by any company that discovered that you exposed private corporate information of a client. I know for a fact that if I ever found out that you published private data of any company you worked for, you would never work for me; I'd make sure of it.

Commented Feb 23, 2015 at 21:15

You need to ask for permission. Aggregate summarized data may be okay, but proprietary data will probably not be.

Commented Feb 23, 2015 at 21:36

5 Answers

You should ask the company where you do your internship. Perhaps they are fine with whatever you want to do. Perhaps they will allow you to use their data but only release summaries, not the raw data. Perhaps they will require you to anonymize it.

In addition, how will you do the analysis? Will you take the company's data and transfer it to your own computer? (This may well not be welcomed by the company.) Or do you plan on using a company-supplied machine to do your analyses? (Same problem - companies don't necessarily like it if you use their hardware for what are essentially personal tasks.) Do you want to install third party software to analyze your data on their machines?

Think through what you want to do, then go to your supervisor. Depending on their personality, people may even be interested in working with you on this, perhaps get a publication out of it.

Whatever you do, don't, don't, don't just use the data without asking. Whether or not you sign an NDA.

answered Feb 22, 2015 at 19:18 Stephan Kolassa

Strong, strong emphasis on "don't". It could get you into serious trouble.

Commented Nov 16, 2018 at 1:51

Ask them and see what they have to say. It's one thing to crawl their public API in your own time (whose terms of use may actually prohibit your research specifically), but using their internal data directly without asking would be unethical at best and potentially illegal at worst. The last thing you want is to be known as the guy who stole people's personal information for his own gain and then got dragged through court when he made it public. Even if your research contains only aggregate data or anonymised datasets, that still won't change the perception of potential future employers.

answered Feb 22, 2015 at 22:50 Damian Nikodem

[*] UPDATE: nightmares like the following (a retrospective IP policy) are precisely the sort of thing that can happen: "PhD student, issued contract at year 3 which will sign over intellectual property. Is it legal?". Now that person can have all their publications vetoed and their thesis closed by the industrial partner.

answered Feb 23, 2015 at 4:46

@StephanKolassa: hell yeah. What if, a couple of years later, the original managers you dealt with have changed, or you want to do derivative or follow-on work, or give your co-researchers access, or the company simply changes its mind, or deprioritizes the collaboration, or your contact leaves? Let alone getting into disagreements about IP, and who does or does not own what, or did or did not discover what. All these things can happen.

Commented Feb 23, 2015 at 10:12

The only danger is that if you want something in writing, the lawyers may get called in. In which case, everything will get much more complicated and take much longer.

Commented Feb 23, 2015 at 10:14

@StephanKolassa: yes, sure. You have to develop a sense for what is achievable, and secure that in writing, upfront, fairly quickly. Often it's simply not worth defining IP rights, or they may already be covered by a frame agreement (or verbal understanding) between the university and the company. Company cultures on all this differ immensely. Often, identifying the person or set of people in the company who are both incentivized to say yes and have the authority to do so is a large part of it. And sometimes educational in itself. Or eye-opening.

Commented Feb 23, 2015 at 10:20

@StephanKolassa: the nightmare scenario is doing a ton of work and then having a publication embargoed or held up by the partner.

Commented Feb 23, 2015 at 10:21

I will be doing an internship at the company that owns that website, and will soon have access to an unlimited supply of their data. Is it okay if I use that data for my academic research project

No, with the same strength as the answer to the question, "I'm doing research on wages in different professions. A friend has invited me round for dinner next week. Is it OK if I rifle through his filing cabinet to see if I can find a payslip?"

If you want to use somebody's data and that data is not unambiguously public, you get their explicit permission, first. In a corporate environment, there are many reasons why permission may be refused: for example, the data may be commercially sensitive or the company might have privacy obligations towards its customers. The fact that you need privileged access to get hold of the data should tell you immediately that you need permission to use it for anything other than the specific reason for which you were granted that access.

answered Feb 23, 2015 at 21:25 David Richerby

Without explicit permission from the client, what you are talking about doing is technically considered corporate espionage.

Facts (as you have stated them):

  1. You are performing academic research consuming public data of Corporation A
  2. You will be taking on an internship with Corp A and have access to Corp A's private information
  3. Your capacity within Corp A is disjoint from your academic research involving Corp A.
  4. You are at least entertaining the idea of using Corp A's private data for research purposes without their consent.

So my recommendation is entirely dependent upon the nature of the relationship between your internship with Corporation A and their knowledge of your research project involving their data.

If they knew about your research project before granting you an internship, I see it as entirely appropriate to ask them for permission.

If they did not know about your research project before granting you an internship, you put your internship at risk by asking for permission to use their private data, depending on the sensitivity of the data you would have access to and the personalities of the business members in control of your relationship with the corporation. The risks associated with you violating their denial of your request could be more trouble than you are worth.

If they did not know about your research project before granting you an internship and you publish private corporate data without explicit consent, they have a legitimate claim that you sought your internship with them for the express purpose of gaining dubious access to their private information, and you would be in for a world of hurt both legally and professionally. Depending on the fallout of that, you could reasonably expect never to be employable again in your field, and you could reasonably expect to find yourself in prison, depending on the nature of the data that you expose and the political clout held by the people that you piss off.

Edit To Add

You may already be between Scylla and Charybdis on this. Look at the potential optics: you have a research project (the nature of which you have not disclosed.) You also have an internship with access to corporate private data of one of the subjects of your research study. Assuming that you do not announce your research project and you take the internship, your standing both within your academic community and within the corporate world will definitely hinge on the nature and results of your research project.

If you take an internship with them and then crucify them in your research project, their quid pro quo will be reciprocal regardless of whether or not they consent to you publishing their private data. With access to their private data, your research methodology will be called into question (perhaps legally) and any potential mistake you may have made will be exacerbated. If your results are flattering to Corporation A, then your academic peers may consider you to be politically beholden to Corporation A.

You may already be in a no-win scenario other than to cancel your research project or cancel your internship. You really are not supposed to do what you are thinking about doing.