FAQs
Joining CORE as data provider
Data Provider’s Guide
This guide explains to repository managers how to configure their systems for successful indexing by CORE.
Go to the guide
Is my repository or journal indexed by CORE?
To check whether your repository or journal is indexed
by CORE, go to our data providers
list.
How do I register my repository or journal with CORE?
CORE uses information from various registries, such as
OpenDOAR
and
DOAJ
to include new repositories and journals into CORE.
If your repository or journal is already registered with
some authoritative registry, you don't need to do anything.
If your repository or journal has not been registered
yet, use
the form
to add it.
Where are the repositories indexed by CORE located?
CORE is an international service and indexes repositories from
various locations around the world. This information is displayed in
a map at our
data providers page.
I am an author of scientific papers and I would like to include
them in CORE.
CORE is an indexing service. Unlike research networking sites
such as ResearchGate or Academia.edu, it does not accept papers
deposited directly by authors, so please do not email us the full text of your
papers. If you have deposited your articles in a repository,
let us know
the name of the repository.
We may already index it; if we do not, we could
start indexing it.
CORE indexes from DOAJ but I cannot find my journal which is already
registered in DOAJ.
CORE indexes DOAJ as a single entry, which means that each journal
title does not appear separately in CORE. If you wish to have a
separate entry for your journal in CORE, please
send us
the
journal's OAI base URL and we will create a new entry.
Indexing
General
How often does CORE index the repositories?
CORE does not index all the repositories that exist in our
database with the same frequency. Repositories are indexed
as frequently as our hardware infrastructure allows.
The specific time of indexing for a repository is determined
by the CORE Scheduler. The CORE Scheduler is a software component
that ensures that our indexing cluster of machines is close to
fully utilised around the clock, every day of the year. As soon as a
resource is freed, the CORE Scheduler decides which repository
to index next based on several criteria. These criteria
include, but are not limited to, when the repository was last
indexed, the size of the repository,
the location of the repository, the repository's indexing
performance and information about potential previous indexing
errors. We review the functionality of the scheduler on a regular
basis to ensure that its decisions on what to index next maximise
the number of ingested documents over a unit of time.
If you have a question regarding a specific
repository, please
get in touch
with us.
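As a rough illustration of how a scheduler of this kind can choose what to index next, here is a minimal Python sketch. It is not CORE's implementation: the Repository fields, the priority function and its weights are all hypothetical, and only two of the criteria listed above (time since the last indexing and previous errors) are modelled.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime


@dataclass
class Repository:
    name: str
    last_indexed: datetime   # previous time the repository was indexed
    recent_errors: int       # errors seen in previous indexing runs


def priority(repo: Repository, now: datetime) -> float:
    """Hypothetical score: a higher value means 'index sooner'."""
    days_stale = (now - repo.last_indexed).total_seconds() / 86400
    # Illustrative weights only: staleness raises the priority,
    # previous indexing errors lower it.
    return days_stale - 2.0 * repo.recent_errors


def pick_next(repos: list[Repository], now: datetime) -> Repository:
    """Return the repository with the highest priority score."""
    return max(repos, key=lambda repo: priority(repo, now))
```

A real scheduler would weigh more signals (size, location, indexing performance) and would be tuned against measured throughput.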
When will the indexing process of my repository be completed?
Depending on the size of the repository and the existing traffic
in CORE's servers, the indexing can last from a couple of hours
to a couple of weeks. If we experience any technical issues
during that period, we will get in touch.
What types of scientific outputs does CORE provide?
CORE indexes all metadata records in a repository, but it can
currently index full text only in PDF. We are working
to include other file types, such as HTML web pages and images.
My repository has plenty of metadata records, but not all of them
have an open access full text. Can my repository be indexed by CORE?
Yes it can, provided that the repository offers its content as
Open Access.
My repository contains a much higher number of records than CORE.
What can I do?
Log in to the
CORE Dashboard
and look under the issues tab. Check whether CORE has the correct OAI base
URL of your repository and whether any technical issues are listed there.
If there are no technical issues or you do not have an account for the CORE Dashboard,
please contact us.
There is a mistake in my article, can I upload a new version?
CORE works at the level of repositories and cannot update specific
records. You can upload the new record into your institutional
repository or journal and CORE will synchronise it at the next
scheduled re-indexing.
There are many new records in my repository. Do I need to notify CORE?
No, CORE follows an automated re-indexing process and your repository
will be re-indexed at the next automated re-indexing.
How is CORE different from Google Scholar?
Google Scholar is a search engine for scholarly research papers, but it is not designed
to collect information from repository and journal systems. More specifically:
Google Scholar crawls and indexes the full text of research papers that can be found on the web,
while CORE also indexes the metadata as supplied by the repository or journal system.
The audience is different. Although CORE offers a search engine, as Google Scholar does,
CORE delivers value by making research information machine readable, providing an open access scholarly
infrastructure that others can build on via the
CORE API
and
Dataset.
The metadata in the CORE display page are wrong. How can I have them corrected?
CORE does not create the metadata, but rather indexes
them from its content providers. If the metadata are wrong
then contact the repository or journal where you had originally
deposited or published your content.
Are the research papers offered by CORE peer-reviewed?
Not all of them. CORE indexes content from repositories and journals. Repositories
do not perform peer review of the deposited content, but journals
do. In some cases the content deposited in a repository has already been
published in a journal and is therefore peer-reviewed. In addition, repositories may contain
grey literature, which is not peer-reviewed.
How is the record count in OpenDOAR estimated by CORE?
CORE has created definitions for the statistics it provides to OpenDOAR.
Metadata:
The total number of metadata records with a unique OAI identifier provided
by the repository, as it appears in the application profile that CORE indexes.
If CORE indexes from the RIOXX endpoint, CORE will provide RIOXX counts instead of Dublin Core counts.
Full text:
The count of metadata records, as above, with at least one attachment
provided by the repository that is a PDF file which a) is publicly downloadable
(no login required, not under embargo, etc.) and b) has machine-readable full text,
i.e. text that can be extracted without OCR.
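The two definitions above can be expressed as a short Python sketch. The Record and Attachment classes and their field names are hypothetical stand-ins for whatever a harvester records about each item; they are not CORE data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Attachment:
    is_pdf: bool
    publicly_downloadable: bool   # no login required, not under embargo
    has_extractable_text: bool    # text can be extracted without OCR


@dataclass
class Record:
    oai_identifier: str
    attachments: list = field(default_factory=list)


def count_metadata(records):
    """Count metadata records with a unique OAI identifier."""
    return len({record.oai_identifier for record in records})


def count_full_text(records):
    """Count unique records with at least one qualifying PDF attachment."""
    qualifying = {
        record.oai_identifier
        for record in records
        if any(
            a.is_pdf and a.publicly_downloadable and a.has_extractable_text
            for a in record.attachments
        )
    }
    return len(qualifying)
```

Under these assumptions a record whose only PDF is a scanned image without a text layer counts towards the metadata total but not towards the full text total.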
Who owns the rights of the content in the CORE collection?
CORE does not own any rights to the aggregated content. Each
resource has its own license, which CORE users
should respect.
I landed at a CORE URL linking to a full text PDF but the link
gives a 404 error. How can I access this output?
You cannot access it – a 404 error indicates that the full text
has been removed from CORE.
My repository contains a mix of OA and non-OA papers.
Can I ask CORE to limit indexing only to OA papers?
Yes, this is possible.
Please contact us and we will enable this for your repository.
Technical
What are the technical requirements for being aggregated?
To transfer data between your system and CORE and keep it
regularly updated, CORE uses a variety of protocols to ingest
the content. The easiest way to get your content integrated with CORE
is the
OAI-PMH protocol.
If you wish to join CORE, get in
touch.
What is my journal’s/repository’s OAI base URL?
OAI base URL looks similar to
or
when homepage URL is
. CORE cannot index the
journal’s/repository’s content via its webpage URL.
If you are not sure whether your journal/repository has an OAI base
URL,
contact our team
and we will provide
technical support to you.
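One way to test a candidate OAI base URL yourself is to send it an Identify request, a standard OAI-PMH verb, and see whether a valid response comes back. The Python sketch below only builds the request URL and reads the repository name from a response; the base URL https://repository.example.org/oai and the sample response are made up for illustration, and no network request is sent.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def identify_url(base_url: str) -> str:
    """Build an OAI-PMH Identify request for a candidate base URL."""
    return base_url + "?" + urlencode({"verb": "Identify"})


def repository_name(identify_response: str):
    """Return the repositoryName from an Identify response, or None."""
    root = ET.fromstring(identify_response)
    node = root.find("oai:Identify/oai:repositoryName", OAI_NS)
    return node.text if node is not None else None


# Hypothetical response body, trimmed to the relevant elements.
SAMPLE_RESPONSE = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <baseURL>https://repository.example.org/oai</baseURL>
  </Identify>
</OAI-PMH>"""
```

If opening the generated URL in a browser returns XML of this shape, the address is very likely the OAI base URL; an HTML page usually means it is the homepage URL instead.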
What is an OAI Identifier?
A more technical answer, aimed at repository managers and technical staff.
An OAI Identifier is a unique identifier which distinguishes items in a repository.
It
“unambiguously identifies an
item within a repository; the unique identifier is used in OAI-PMH requests for extracting metadata
from the item”.
The identifier contains three parts, separated by colons (":"):
“oai”: the scheme, which declares the type of the identifier.
“website address”: the domain name of the repository where the item is hosted.
“unique identifier”: a local identifier for the item within that repository.
For example:
oai:eprints.gla.ac.uk:129357
oai:digitalcommons.odu.edu:oaweek-1012
oai:oro.open.ac.uk:75049
oai:dspace.stir.ac.uk:1893/24654
Not all OAI Identifiers follow this pattern, but those that do not are non-standard and their use is discouraged.
OAI Identifiers must follow the
URI (Uniform Resource Identifier)
syntax.
For more information about how OAI Identifiers are formed, visit
Specification and XML Schema for the OAI Identifier Format.
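The three-part structure described above can be checked with a few lines of Python. This is a minimal sketch, not a full validator: the function name is our own, and it only splits on the first two colons and checks the "oai" scheme, without validating the domain name or the URI syntax of the local part.

```python
def parse_oai_identifier(identifier: str):
    """Split an OAI identifier into (scheme, namespace, local identifier)."""
    # Split on the first two colons only, so that the local identifier
    # may itself contain colons or slashes.
    parts = identifier.split(":", 2)
    if len(parts) != 3 or parts[0] != "oai" or not all(parts[1:]):
        raise ValueError(f"not a standard OAI identifier: {identifier!r}")
    scheme, namespace, local_id = parts
    return scheme, namespace, local_id
```

For example, parse_oai_identifier("oai:dspace.stir.ac.uk:1893/24654") separates the repository domain dspace.stir.ac.uk from the local identifier 1893/24654.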
An OAI identifier is registered but does not resolve.
Reasons:
- The item has not been indexed by CORE yet.
- The repository is not yet registered as a CORE data provider. Become a
provider.
How difficult is it to satisfy the CORE indexing recommendations?
We would expect that indexing could take from one hour to a
couple of days for a typical repository. In some repository
systems, such as EPrints, most of these recommendations are
followed by default. Find more details on
how it
works.
Which metadata formats does CORE support?
We mainly support oai_dc, the mainstream metadata format used in the
OAI-PMH Protocol
utilising the
Dublin Core
vocabulary, a popular vocabulary for bibliographic data.
We also support
RIOXX
, a richer metadata
profile used mostly by UK repositories.
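A repository advertises the metadata formats it can serve through the standard OAI-PMH ListMetadataFormats request. The Python sketch below extracts the format prefixes from such a response. The sample response is invented for illustration, including the "rioxx" prefix, so check the prefixes your own repository actually reports.

```python
import xml.etree.ElementTree as ET

# XML namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def supported_prefixes(list_formats_response: str):
    """Return the metadataPrefix values listed in a ListMetadataFormats response."""
    root = ET.fromstring(list_formats_response)
    path = "oai:ListMetadataFormats/oai:metadataFormat/oai:metadataPrefix"
    return [node.text for node in root.findall(path, OAI_NS)]


# Hypothetical response body, trimmed to the relevant elements.
SAMPLE_FORMATS = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
    </metadataFormat>
    <metadataFormat>
      <metadataPrefix>rioxx</metadataPrefix>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""
```

The request itself is the OAI base URL followed by ?verb=ListMetadataFormats.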
Can CORE just index our content, but not store it on its servers?
To provide its service, it is essential for CORE to be able to store
a cached copy of the indexed content. This is needed to verify open
access sources, offer analytical services, support text and data mining,
recommendation tools, etc. By caching a copy of the indexed
resource, CORE is no different from many commercial and
non-commercial, academic and non-academic, search engines including
Google or CiteSeerX.
The primary difference from such systems is that CORE caches only
copies of open access content. More information on the benefits of
this approach is available in the
“CORE: Three Access Levels to
Underpin Open Access”
article.
How can my repository opt out from being indexed by CORE?
CORE uses information from various registries, such as
OpenDOAR
, to include new repositories,
journals and archives into CORE. If the circumstances have changed in
your repository, you can restrict indexing and crawling activities
by modifying the rules in your “robots.txt” file, following the
Standard for Robots Exclusion.
This will also prevent the content from being cached by search
engines and indexing systems. In addition, you could withdraw your
repository from all open access registries; when this takes
place, please
notify us.
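Before relying on a robots.txt change, you can check what the rules actually allow with Python's standard urllib.robotparser module. In the sketch below the rules, the URLs and the crawler name "ExampleBot" are placeholders; the user agent that a particular crawler announces has to be looked up in its own documentation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block every crawler from the /restricted/ path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /restricted/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these rules, a URL under /restricted/ is refused for any crawler, while URLs elsewhere on the site remain fetchable.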
Removing full text or metadata
How does CORE ensure that the indexed content is Open Access?
CORE aggregates content from repositories registered in
OpenDOAR
, journals registered
in
DOAJ
or those content providers that
requested their content to be aggregated. This means that all the
content sources aggregated by CORE must be open access as this is a
requirement for the providers to be included in these registries.
According to the official
BOAI definition of open
access,
CORE is allowed to,
"distribute, search, or link to the full
texts of articles, crawl them for indexing, pass them as data
to software, or use them for any other lawful purpose, without
financial, legal, or technical barriers other than those
inseparable from gaining access to the Internet itself. The only
constraint on reproduction and distribution, and the only role for
copyright in this domain is to give authors control over the
integrity of their work and the right to be properly acknowledged
and cited."
CORE indexes copyrighted material from my repository
CORE’s system is fully automated and relies on data made available
in a machine readable form. If your repository hosts full text with
a restrictive license that prohibits indexing, this needs to be
properly communicated in a machine readable form. All non-open-access
items should be blocked in the robots.txt file. If this
information is provided in the metadata for each record and CORE
still exposes the full text, please get in
touch
with us.
How can I retrieve the license information (CC-BY, CC-BY-NC, etc.)
from the indexed outputs?
An output's license is not consistently exposed by content providers in a
machine readable form. In some circumstances it may be possible to extract
this from the "fulltextUrls" field in the
CORE API
. However, this is
subject to the license data offered by the data provider.
A full text record is deleted from my repository, but is still
available in CORE
CORE’s system is fully automated and relies on data made available
in a machine readable form. Our system understands that the
full text of a record was removed only when the record is marked
as deleted in the metadata of your repository. See
how to take down full text
from CORE in the related FAQ.
How do I notify CORE to take down full text content?
If full text content appears in CORE but not in the
hosting service, your repository manager can take it down via the
CORE Repository Dashboard
at any time without notifying us. Alternatively, you can use the
update or remove article form.
The full text has been removed, but the metadata is still visible
in CORE
It is CORE policy to remove only the full text and not
the metadata. Only in limited cases, e.g. when a publication
did not happen, can the metadata be removed.
In that case,
email us.
I have removed an item from my repository, but the metadata are
still visible in CORE
In order for the metadata to be removed from CORE they need to be
marked as "deleted/removed" in the repository. If the metadata are
marked as "restricted", CORE will still display them.
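In OAI-PMH, a repository signals a deleted record by setting status="deleted" on the record header. Assuming your repository exposes deletions this way, the following Python sketch lists the deleted identifiers in a ListIdentifiers response, which is a quick way to confirm that a removal is visible to harvesters. The sample response and its identifiers are made up.

```python
import xml.etree.ElementTree as ET

# XML namespace used by OAI-PMH 2.0 responses.
OAI = "http://www.openarchives.org/OAI/2.0/"


def deleted_identifiers(response_xml: str):
    """Return identifiers whose OAI-PMH header carries status="deleted"."""
    root = ET.fromstring(response_xml)
    deleted = []
    for header in root.iter(f"{{{OAI}}}header"):
        if header.get("status") == "deleted":
            identifier = header.find(f"{{{OAI}}}identifier")
            if identifier is not None:
                deleted.append(identifier.text)
    return deleted


# Hypothetical response body with one live and one deleted record.
SAMPLE_HEADERS = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>oai:repository.example.org:1</identifier>
    </header>
    <header status="deleted">
      <identifier>oai:repository.example.org:2</identifier>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""
```

If an item you removed does not show up in this list, the repository is probably not exposing the deletion in its OAI-PMH output.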
Membership
Membership documentation
The membership documentation provides detailed information about CORE Members’ benefits, including the description of Members-only functionalities on the CORE Dashboard and beyond.
Go to the guide
What is membership of CORE?
We founded CORE to provide free indexing and discovery for institutions.
Initially, we were funded to provide these tools, but much of our funding ended in July 2023.
As a result, we have created a membership model, with both free and paid membership.
Our founding goal remains as before: we index any institutional repository.
If you register as a starting member, you can see a freely available dashboard that shows
useful information about your repository. We have added optional paid tools
that assist in analytics and compliance.
Is CORE paid membership worth it?
By joining CORE you are supporting the open research community.
Moreover, we believe that our paid membership provides tools that
save your institution time and money. They enable you to check automatically
what would otherwise have to be done by hand, for example finding copies of
papers by a member of your institution that were deposited in another repository
(what we call the “cross-repository check”). Finally, if you are a paying member,
we provide anyone from your institution with fast API access to the full
CORE dataset for text and data-mining (TDM) purposes.
Your membership prices look very high for low-income countries
Our aim is to make membership available for the widest range of institutions.
Accordingly, we have set three tiers of membership, using the widely accepted
World Bank criteria
for low- and middle-income countries. Our aim is to make membership available to all,
because the more institutions that join, the better the service for all members:
specifically, we can check other repositories for content relevant to your institution, wherever it appears.
We don’t have enough technical knowledge to use the dashboard. Can you help us?
Many institutions don’t have the staff or the knowledge to manage their repository effectively.
For sustaining members, we provide a free repository health check every year.
We go through all the results from our automatic checks and show you how to fix them.
In other words, we provide a kind of out-sourced technical resource for you.
Do I need to be a member of CORE to use the Recommender tool?
Both the Recommender and the Discovery tools are free of charge for any institution
that provides their data for indexing. They are both benefits of the free starting membership.
We don’t use our repository to track compliance, so why do we need compliance tools?
In many countries, including the United States, institutions may not track compliance
across the institution. However, individual faculties and departments want to ensure
that their publications are compliant with federal and funder mandates.
Membership of CORE enables you to identify that a paper has been deposited,
even if it was not deposited in your local repository.
Dashboard
What is the CORE dashboard?
For any institution, we provide a dashboard that enables them to see their
repository from the outside - the way that any external service sees them.
We give you statistics about the number of items indexed, how many of
them are full text, and the proportion of content that has a DOI identifier.
The dashboard is free to access for any institution that signs up for free starting membership.
Why can’t I see my dashboard?
You can only see the dashboard for your organisation if you have signed up as a member.
Becoming a starting member is free of charge for any institution that provides content for CORE to index.
Simply contact the administrator at
[email protected]
to be sent an invitation to open an account.
How does CORE help my institution with compliance?
Open access articles are free to access, but the publisher may not maintain
a freely available copy for researchers to access. It makes sense for researchers
to self-archive their content. The author (or the institution) posts the article in
a repository, typically the institutional repository of the university where they work.
However, in many countries, funding requirements are not only that the article
should be available open access, but that the article should be available within a certain
time period after acceptance. For the UK REF 2029, this time period, known as the deposit delay, was 92 days.
This introduces a reporting requirement for any institution that wants to be compliant with funder mandates:
it is necessary to demonstrate that articles are compliant, that is, available open access and deposited within the correct time frame.
For every institutional repository where CORE indexes content, we try to assist you with compliance,
identifying the date of deposit wherever possible, as well as the date of publication.
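The deposit delay check itself is simple date arithmetic, as the Python sketch below shows. The 92-day figure is the one quoted in this FAQ. The function names are ours, the reference date (acceptance or publication, depending on the policy wording) is supplied by the caller, and treating a delay of exactly 92 days as outside the window is an assumption about the boundary.

```python
from datetime import date

# Deposit window quoted in this FAQ for REF 2029.
MAX_DEPOSIT_DELAY_DAYS = 92


def deposit_delay_days(reference_date: date, deposit_date: date) -> int:
    """Number of days between the reference date and the deposit."""
    return (deposit_date - reference_date).days


def is_within_window(
    reference_date: date,
    deposit_date: date,
    max_days: int = MAX_DEPOSIT_DELAY_DAYS,
) -> bool:
    """True if the deposit was made fewer than max_days after the reference date."""
    # Assumption: a delay of max_days or more counts as outside the window.
    return deposit_delay_days(reference_date, deposit_date) < max_days
```

For a reference date of 1 January 2024, a deposit on 1 April 2024 (91 days later) falls inside the window under this assumption, while a deposit on 2 April 2024 does not.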
How can I track the date of deposit for my content?
The date of deposit is increasingly added as a metadata field when content is
uploaded to repositories.
RIOXX
, for example, is a metadata
profile that enables metadata to be shared across repositories, particularly
date of deposit metadata. If all institutions make the date of deposit information available,
CORE can provide a service for every repository by finding compliant versions of papers in other repositories.
I have a CRIS system, so why do I need CORE?
Simply put, any CRIS system (Current Research Information System) works for an individual institution.
Although it can often show compliance for papers within that institution,
a CRIS system (or an individual repository) cannot find duplicate copies of articles
that were deposited in another institution, for example if a paper has a co-author
from Sheffield when the main author is based at Leeds. CORE is unique in indexing
repositories from around the world, and can identify duplicate copies of papers.
As a result, CORE and CRIS systems are complements, not substitutes, for each other.
CORE services
How do I register for an API key?
Use the
registration form
to retrieve
your personal access key for the CORE API.
Can I use the CORE API in a project?
If you plan to use the CORE API we kindly ask the following:
attribute CORE by including in your website
this snippet
send us
an email
with a brief summary on
how you are using the CORE API,
grant us permission to present this summary to our funders and/or
display it on our website,
allow us to list your company’s name, URL and logo on our
website.
Does CORE offer a higher download quota of the CORE API?
Yes, this is possible but there is usually a cost associated with it.
Please
email us
the name of your
company or organisation, business entity, the number of requests you
estimate you will send and how often you will send them, and we will get
back to you with a quote.
Can I use the CORE API for commercial purposes?
Yes, you can use the CORE API for commercial purposes (Terms & Conditions apply).
We provide a 30-day free trial for Institution and Enterprise.
We will ask you for your circumstances during the registration process and we might
contact you afterwards to clarify any points and assess your eligibility for a free licence.
If your circumstances change, please let us know.
Rate limits
The CORE API is free and does not require registration, subject to our
rate limits
. However, organisations that register get a faster rate that is typically not free. For
Supporting and Sustaining Members
, the faster rate comes as a free member benefit.
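A client can stay under a rate limit by spacing out its own requests. The Python sketch below is a generic throttle, not part of any CORE client library; the class name is ours, and the interval you pass in is something you choose from the published rate limits.

```python
import time


class Throttle:
    """Space successive calls at least min_interval seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Call wait() immediately before each API request; the first call returns at once and later calls pause only as long as needed.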
Can I use the CORE dataset for commercial purposes?
Please note the dataset has been created from information that was
publicly available on the Internet. Every effort has been made
to ensure this dataset contains only open access content.
We have included only content from repositories and journals that
are listed in registries where the condition for inclusion is the
provision of content under an open access compatible license. However,
as metadata are often inconsistent, license information is often not
machine readable, and repositories from time to time leak information
that is not open access, we cannot take any responsibility for the
license of the content in the dataset. It is therefore up to the user
of this dataset to ensure that the way in which they use the dataset
does not breach copyright. The dataset is in no way intended for the
purposes of reading the original publications, but for machine
processing only.
How often is the CORE dataset updated?
We aim to generate a new public dataset at least once a year.
If you need a more recent dataset, please
get in touch
with us as we might be able to arrange it.
How can I download the CORE Recommender?
If you have access to the CORE Repositories Dashboard, log into the
CORE Repository Dashboard
and you will get the
instructions on how to download the
CORE Recommender.
Otherwise, visit our recommender
registration page
where you will also find the installation instructions. We strongly
recommend that repository managers use the CORE Repositories
Dashboard.
When I look at the papers I have authored, I am not pleased with the
similar papers suggested by the CORE Recommender.
How can I change that?
The CORE recommender uses the popular
content-based filtering
system.
The similar resources that appear in the CORE Recommender and their
quality are highly impacted by the metadata information supplied by
the repository of origin. If that information is incorrect or
incomplete, you should contact the repository of origin. To improve
the CORE recommendations, you can use the feedback button, with
which you can remove any undesirable articles.
Does CORE provide a list of PDFs in a specific language?
Unfortunately, CORE does not provide any language-specific datasets
at the moment. Users can use the CORE API to download individual
PDFs.
What does non-commercial use mean?
Non-commercial means:
The organisation is a registered charity or a not-for-profit AND
the use of the CORE service will not enable, contribute to
or support the use of any paid-for service of the organisation
or of another third party organisation linked to this organisation.
General information about CORE
I need a high resolution logo of CORE. Where can I find it?
a high resolution logo of CORE.
Where can I find CORE’s brochure?
You can find the most recent CORE brochure in
our resources list.
Where can I find CORE’s flyer?
Access
CORE’s flyer
in our resources.
How can I embed the CORE badge on my website?
CORE badges
You can use the badges below on your website to show that your content is indexed by CORE and that you are part of the CORE and Open Research community. Please choose the badges according to your membership tier and include them in your system by means of the supplied HTML tags.







Do you need to cite CORE?
Visit our
research page.
CORE guidance on REF2029 Audit
Why should deposit dates be provided by aggregators and not by
single institutions?
Because of cross-university collaborations, some outputs that could be
considered non-compliant due to a late deposit may be compliant
thanks to a deposit that was made within the policy timeframe at
another institution or subject repository. Individual institutions
benefit because they might not be aware of all compliant
deposits and might consider some of their outputs non-compliant
when in fact they are compliant.
How do I ensure that my repository's deposit date matches CORE’s
deposit date?
CORE captures data as explained in the
CORE
recommendations
. If you follow our recommendations, CORE
should have the same deposit dates as the repository.
A metadata record with a full text PDF does not appear in CORE from my
own repository but it has been indexed from another repository. What
if the record was not deposited on time at the other repository? Will
this have an impact on REF2029 compliance?
CORE can identify deposits of the same articles from across
repositories. By doing so, an output deposited late in repository A
could be technically compliant provided that it was deposited within
the timeframe at repository B, i.e. the earliest deposit date
irrespective of the repository could be used. However, for the time
being, CORE has agreed to supply the data to Research England, and it will
be at the discretion of Research England to interpret the data. We
understand that the motivation is to mark outputs as non-compliant
only in cases where there is clear evidence that they are truly
non-compliant.
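The "earliest deposit date irrespective of the repository" idea can be sketched in Python as follows. How copies of the same output are matched across repositories is not described here, so the output_id key (for instance a DOI) is an assumption, as are the function name and the tuple layout.

```python
from datetime import date


def earliest_deposits(deposits):
    """Map each output to the (repository, date) of its earliest deposit.

    deposits is an iterable of (output_id, repository, deposit_date) tuples,
    where output_id is whatever key identifies the same output everywhere.
    """
    earliest = {}
    for output_id, repository, deposited in deposits:
        current = earliest.get(output_id)
        if current is None or deposited < current[1]:
            earliest[output_id] = (repository, deposited)
    return earliest
```

Given the same output deposited in repository A on 1 May 2024 and in repository B on 1 February 2024, the function reports repository B's date, which is the one a cross-repository check would compare against the policy timeframe.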
CORE indexes full text PDFs only. Does this mean that outputs with
full text in other formats, e.g. Word documents or text files, will
not be considered as REF2029 compliant outputs?
CORE indexes both metadata and full texts, currently only in PDF
format but we will include support for other formats in the future.
While the presence of the full text is preferred, CORE has all
information necessary to support the REF2029 audit as long as the
metadata of your outputs are in CORE. To minimise the possibility of
some of your outputs not being captured by CORE, please follow the
CORE recommendations.
When I log into the CORE Repository Dashboard in the “Content” tab I
see a date. Is this date the deposit date that CORE has for the
output?
CORE captures data as explained in the CORE recommendations. If you
follow our recommendations, CORE should have the same deposit date
as the repository. The date shown in the CORE Repositories Dashboard is read by our indexing system from the “deposited date” exposed by your
own repository system.
How can I see the deposit date from my repository's REF-able outputs
in CORE?
Deposit dates are available via the
CORE Dashboard
. Repository
managers can access the percentage of papers that are non-compliant,
e.g. outputs that were deposited 92 days or more after publication,
according to the REF 2029 Open Access Policy.
Can I have my repository's deposit dates from CORE in a CSV file?
This is possible via the
CORE Dashboard.
Repository managers can access the percentage of papers that are
non-compliant, e.g. outputs that were deposited 92 days or more
after publication, according to the REF 2029 Open Access Policy.
The date that a metadata record was created may not be the same as
the date that the full text record was attached to a metadata record.
How does CORE know the date that the full text was attached?
CORE captures data as explained in the
CORE recommendations.
How does CORE know that the version of the deposited full text is the
correct and compliant version?
The validation that the deposited full text is the first compliant
version is currently not in the scope of CORE's support for the
REF2029 Audit. Research England might use alternative methods to
check this.