Why you should avoid copy-pasting

Let's say you're reading a blog article and you find some interesting sentences. You figure it would be nice to have them in Tatoeba, so you copy-paste them. It may seem like you're doing nothing wrong, but that's not exactly true.

Copy-pasting sentences to Tatoeba instead of creating your own sentences is generally a bad idea. If you don't know much about licenses, copyright, intellectual property and this kind of stuff, you should really not copy-paste anything at all.

Note that this doesn't apply strictly to the action of copying and pasting text. It applies to copying in general: copying things you've read or heard. For simplicity, we will use the term "copy-paste" to refer to the action of adding sentences that you didn't create yourself.

Reasons why you should not copy-paste

1. You run the risk of seeing all your copy-pasted sentences erased from Tatoeba.

The original author can request at any time to have the sentences deleted from Tatoeba, and Tatoeba will comply with this request. We acknowledge that it takes effort, time, and creativity to produce sentences. If the original author doesn't want to see their content within our corpus, we will respect their choice.

If we consider what you have copy-pasted to be outright theft, we will delete your copy-pasted sentences even without any prior request from the author(s).

2. You run the risk of wasting other people's time.

When we delete copy-pasted sentences, we will also remove all the translations. So everyone who translated those sentences that you copy-pasted may have worked for nothing.

3. It can affect any project that reuses Tatoeba's data.

More and more projects are reusing our data. Whatever you add to Tatoeba will not stay within Tatoeba: the data will spread to many other places.

If you copy-pasted sentences that you shouldn't have, this won't be just our problem but a problem for these other projects too.

4. There can be legal consequences.

We haven't had any legal issues so far, but as our corpus grows, it could someday happen. It's mostly a matter of scale.

  • If you only copy-pasted a few sentences from a certain source, it could go totally unnoticed, or the author could decide it is not worth taking legal action.

  • If, however, we end up with a significant amount of content copy-pasted from a certain source, the authors could sue us, sue you, or both.

When can you copy-paste?

1. When the author has explicitly agreed

Never assume it's fine to copy-paste. Be respectful of other people's work.

You can copy-paste when you know for certain that the author of the content is fine with it. More specifically: they must agree to have their content incorporated into Tatoeba's corpus, knowing that Tatoeba redistributes its data.

This applies to any type of content where language is used: a song, a film, a book, a news article, an FAQ for a piece of software, a podcast...

Content creators will often state the license of their work somewhere.

  • The license will tell you under which conditions they agree to have their work reused (if at all).
  • From the license you should be able to conclude whether or not it's okay to copy-paste their content.
  • If there is no license, or if the license is unclear to you, you will have to contact them directly and ask.

Note that it is better if they publicly state the conditions for reusing their work via a license, rather than just telling you in a private exchange.

2. When the content is in the public domain

There is of course the case where the authors are long gone, in which case you can no longer contact them. This is where the notion of public domain comes in.

This notion varies from one country to another and may not exist at all in some countries. Still, most societies have agreed that, some time after the death of the author, any intellectual work can be reused for any purpose without permission.

How long exactly? It depends on the country. You will need to look it up yourself.

3. Other cases

As far as we know, there are no other cases where it is safe for you to copy-paste. We are, however, not specialists.

If you disagree with our policies and believe there are other cases where you should be allowed to copy-paste, please let us know before you actually copy-paste anything.

