mediawiki-event-schemas/2.yaml at master · wikimedia/mediawiki-event-schemas · GitHub
The Wayback Machine - https://web.archive.org/web/20190711150617/https://github.com/wikimedia/mediawiki-event-schemas/blob/master/jsonschema/mediawiki/page/move/2.yaml
title: mediawiki/page/move
description: Represents a MW Page Move event.
$schema:
type: object
properties:
  ## Meta data object. All event schemas should have this.
  meta:
    type: object
    properties:
      topic:
        description: The queue topic name this message belongs to.
        type: string
      schema_uri:
        description: >
          The URI identifying the jsonschema for this event. This may be just
          a short uri containing only the name and revision at the end of the
          URI path. e.g. schema_name/12345 is acceptable. This field
          is not required.
        type: string
      uri:
        description: The unique URI identifying the event.
        type: string
        format: uri
      request_id:
        description: The unique ID of the request that caused the event.
        type: string
      id:
        description: The unique ID of this event; should match the dt field.
        type: string
        pattern: '^[a-fA-F0-9]{8}(-[a-fA-F0-9]{4}){3}-[a-fA-F0-9]{12}$'
      dt:
        description: The timestamp of the event, in ISO 8601 format.
        type: string
        format: date-time
      domain:
        description: The domain the event pertains to.
        type: string
    required:
      - topic
      - uri
      - id
      - dt
      - domain

  ## MediaWiki entity fields. All MediaWiki entity events should have these.
  database:
    description: The name of the wiki database this event belongs to.
    type: string
  performer:
    description: Represents the user that performed this change.
    type: object
    properties:
      user_id:
        description: >
          The user id that performed this change. This is optional, and
          will not be present for anonymous users.
        type: integer
      user_text:
        description: The text representation of the user that performed this change.
        type: string
      user_groups:
        description: A list of the groups this user belongs to. E.g. bot, sysop etc.
        type: array
        items:
          type: string
      user_is_bot:
        description: >
          True if this user is considered to be a bot. This is checked
          via the $user->isBot() method, which considers both user_groups
          and user permissions.
        type: boolean
      user_registration_dt:
        description: >
          The datetime of the user account registration.
          Not present for anonymous users or if missing in the MW database.
        type: string
        format: date-time
      user_edit_count:
        description: >
          The number of edits this user has made at the time this revision is created.
          Not present for anonymous users.
        type: integer
        minimum:
    required:
      - user_text
      - user_groups
      - user_is_bot
  comment:
    description: The comment left by the user that performed this change.
    type: string
  ## Since mediawiki.page-move v2
  parsedcomment:
    description: >
      The comment left by the user that performed this change
      parsed into simple HTML. Optional.
    type: string

  ## Page entity fields. All page-related events should have these.
  page_id:
    description: The page ID of the moved page.
    type: integer
    minimum:
  page_title:
    description: The normalized title of the page.
    type: string
  page_namespace:
    description: The namespace ID this page belongs to.
    type: integer
  page_is_redirect:
    description: >
      True if this page is currently a redirect page. This
      fact is ultimately represented by revision content containing
      redirect wikitext. If rev_id's content has redirect wikitext,
      then this page is a redirect. Note that this state is also
      stored on the MediaWiki page table.
    type: boolean
  rev_id:
    description: The new head revision created during this page move.
    type: integer
    minimum:

  ## Page move specific fields.
  prior_state:
    description: >
      The prior state of the entity before this event. If a top level entity
      field is not present in this object, then its value has not changed
      since the prior event.
    type: object
    properties:
      page_title:
        description: The normalized title of this page before this event.
        type: string
      page_namespace:
        description: The namespace ID this page belonged to before this event.
        type: integer
      rev_id:
        description: The head revision of this page before this event.
        type: integer
        minimum:
    required:
      - page_title
      - page_namespace
      - rev_id
  new_redirect_page:
    description: >
      Information about the new redirect page auto-created
      at the old title as a result of this page move.
      This field is optional and will be absent if no redirect
      page was created.
    type: object
    properties:
      page_id:
        description: The page ID of the newly created redirect page.
        type: integer
        minimum:
      page_title:
        description: This will be the same as prior_state.page_title.
        type: string
      page_namespace:
        description: This will be the same as prior_state.page_namespace.
        type: integer
      rev_id:
        description: The revision created for the newly created redirect page.
        type: integer
        minimum:
    required:
      - page_id
      - page_title
      - page_namespace
      - rev_id

required:
  - meta
  - database
  - performer
  - page_id
  - page_title
  - page_namespace
  - page_is_redirect
  - rev_id
  - prior_state
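
To illustrate how the `required` lists and the `meta.id` pattern above constrain an event, here is a minimal sketch in Python. It uses only the standard library and is not a full JSON Schema validator: it checks only for the presence of required fields and the UUID pattern. The function names (`missing`, `check_page_move`) and every value in `sample_event` are hypothetical and chosen for illustration; only the field names, the pattern, and the topic name `mediawiki.page-move` come from the schema.

```python
import re

# Pattern copied from the schema's meta.id field.
UUID_PATTERN = re.compile(r"^[a-fA-F0-9]{8}(-[a-fA-F0-9]{4}){3}-[a-fA-F0-9]{12}$")

# Field names copied from the schema's top-level `required` list.
REQUIRED_TOP = [
    "meta", "database", "performer", "page_id", "page_title",
    "page_namespace", "page_is_redirect", "rev_id", "prior_state",
]

# Field names copied from the `required` lists of the nested objects.
NESTED_REQUIRED = {
    "meta": ["topic", "uri", "id", "dt", "domain"],
    "performer": ["user_text", "user_groups", "user_is_bot"],
    "prior_state": ["page_title", "page_namespace", "rev_id"],
    "new_redirect_page": ["page_id", "page_title", "page_namespace", "rev_id"],
}


def missing(obj, names, prefix=""):
    """List the names from `names` that are absent from the mapping `obj`."""
    return [prefix + name for name in names if name not in obj]


def check_page_move(event):
    """Return a list of problems; an empty list means these checks passed."""
    problems = missing(event, REQUIRED_TOP)
    # Nested objects are only checked when present; new_redirect_page is optional.
    for key, names in NESTED_REQUIRED.items():
        nested = event.get(key)
        if isinstance(nested, dict):
            problems += missing(nested, names, key + ".")
    meta = event.get("meta")
    if isinstance(meta, dict):
        event_id = meta.get("id")
        if isinstance(event_id, str) and not UUID_PATTERN.match(event_id):
            problems.append("meta.id does not match the UUID pattern")
    return problems


# Hypothetical event: every value below is made up for illustration.
sample_event = {
    "meta": {
        "topic": "mediawiki.page-move",
        "uri": "https://example.org/wiki/New_title",
        "id": "0123abcd-4567-89ab-cdef-0123456789ab",
        "dt": "2019-07-11T15:06:17Z",
        "domain": "example.org",
    },
    "database": "examplewiki",
    "performer": {
        "user_text": "ExampleUser",
        "user_groups": ["sysop"],
        "user_is_bot": False,
    },
    "page_id": 1,
    "page_title": "New_title",
    "page_namespace": 0,
    "page_is_redirect": False,
    "rev_id": 101,
    "prior_state": {"page_title": "Old_title", "page_namespace": 0, "rev_id": 100},
}

print(check_page_move(sample_event))
```

The sketch deliberately skips type, `format`, and `minimum` checks. A real consumer would more likely load the YAML and hand it to a JSON Schema validation library, which covers those keywords as well.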