T232237: VE mobile default test buckets are unbalanced
Closed, Resolved · Public
Assigned To: matmarex
Authored By: ppelberg, Sep 7 2019, 1:52 AM
Tags: VisualEditor-MediaWiki-Mobile (Backlog), VisualEditor (Current work) (Product owner review), MW-1.35-notes (1.35.0-wmf.1; 2019-10-08), Product-Analytics (Kanban) (Doing)
Referenced Files:
F30769046: image.png (Oct 16 2019, 9:09 PM)
F30386198: image.png (Sep 18 2019, 3:06 AM)
F30386194: image.png (Sep 18 2019, 3:06 AM)
Subscribers
DannyS712
DLynch
Esanders
Jdlrobson
JTannerWMF
kzimmerman
marcella
(12 subscribers in total)
Description
In T229426#5468481, we uncovered:
"...53.4% of the users (both registered and anonymous) who were bucketed ended up in the wikitext default bucket. It turns out that it would be incredibly unlikely (p << 10^-15) to get an imbalance this big if our random assignment was actually 50%–50%. So there's clearly a serious issue somewhere that we need to understand."
bucket | users
source default | 1,302,187
visual default | 1,214,917
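For intuition, here is a quick sanity check of that p-value claim (a sketch, not part of the original task; it uses the normal approximation to the binomial and the counts from the table above):

<?php
// How many standard deviations is the observed split from a fair
// 50/50 assignment?
$source = 1302187;
$visual = 1214917;
$n = $source + $visual;          // 2,517,104 bucketed users in total
$expected = $n / 2;              // ~1,258,552 per bucket under 50/50
$sd = sqrt( $n * 0.25 );         // binomial SD with p = 0.5, ~793
$z = ( $source - $expected ) / $sd;
printf( "z = %.1f\n", $z );      // ~55 SDs out, so p is far below 10^-15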
"Done"
This ticket is intended to represent the work of trying to understand the following:
1. What might be causing contributors not being assigned to test buckets in a more balanced way?
Contributors' bucket assignments were being recorded after the editor finished loading. This means that in situations where contributors abandon their edits before the editor loads, no information about them (including their bucket assignment) is stored. See: T232237#5501940
2. What are the implications of contributors not being assigned to test buckets in a more balanced way? (e.g. Are the results, and by extension the test, valid?)
There is a percentage of contributors whose editing behavior we do not have access to. This makes further analysis more difficult, considering we'd need to exclude them from this analysis or make assumptions about their behavior. See: T232237#5520035
Details
Related Changes in Gerrit:
Subject: Load editor EventLogging code earlier
Repo: mediawiki/extensions/MobileFrontend
Branch: master
Lines: +36 / -38
Related Objects
Status | Assigned | Task
Resolved | None | T255327: [Epic] Evaluate which editing interface should be shown by default
Resolved | None | T227338: Test visual editor as the default mobile editor on select wikis
Resolved | ppelberg | T232175: Investigate VE as default A/B test findings
Resolved | matmarex | T232237: VE mobile default test buckets are unbalanced
Mentioned In
T226164: Invalid MobileWebSearch events being logged
T223339: Re-run metrics from VE on mobile report
T235104: QA for EventLogging patch
T235101: Rerun mobile VE as default A/B test
T234277: Decide how we move forward with the A/B test
Mentioned Here
T235104: QA for EventLogging patch
T221198: VE mobile default: analyze A/B test results
T229426: Check on VE as default A/B test results
Event Timeline
ppelberg created this task. (Sep 7 2019, 1:52 AM)
ppelberg added a parent task: T232175: Investigate VE as default A/B test findings.
ppelberg updated the task description. (Sep 7 2019, 1:59 AM)
DLynch subscribed and added a comment. (Sep 7 2019, 8:33 PM)
For the code-side generation of this distribution, this is where it's defined:
if ( $user->isAnon() ) {
	// Anonymous users: bucket by a persistent random token stored in a cookie
	$cookie = $context->getRequest()->getCookie( 'MFDefaultEditorABToken' );
	if ( !$cookie ) {
		$cookie = MWCryptRand::generateHex( 32 );
		$context->getRequest()->response()->setCookie(
			'MFDefaultEditorABToken', $cookie, time() + ( 90 * 86400 )
		);
	}
	$vars['wgMFSchemaEditAttemptStepAnonymousUserId'] = $cookie;
	$anonid = base_convert( substr( $cookie, 0, 8 ), 16, 10 );
	$defaultEditor = $anonid % 2 === 0 ? 'source' : 'visual';
	$vars['wgMFSchemaEditAttemptStepBucket'] = 'default-' . $defaultEditor;
} elseif ( $user->getEditCount() <= 100 ) {
	// Logged-in users (with 100 or fewer edits): bucket by user ID parity
	$defaultEditor = $user->getId() % 2 === 0 ? 'source' : 'visual';
	$vars['wgMFSchemaEditAttemptStepBucket'] = 'default-' . $defaultEditor;
}
For anonymous users, a persistent ID is generated via MWCryptRand::generateHex( 32 ). This is the same method used to generate the session ID for the desktop WikiEditor. It's then converted to a decimal number via that base_convert call, using the same method as EventLogging's sessionInSample. For logged-in users, their user ID is used.
The resulting ID is then used to decide which editor they're assigned: if it was even they get the source editor, odd they get the visual editor.
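As a sanity check on the assignment rule itself, one can simulate the anonymous bucketing (my sketch, not from the comment; random_bytes() stands in for MWCryptRand outside MediaWiki):

<?php
// Simulate the anonymous-user bucketing: random 32-char hex token,
// first 8 hex chars converted to decimal, parity picks the editor.
$counts = [ 'source' => 0, 'visual' => 0 ];
for ( $i = 0; $i < 1000000; $i++ ) {
	$token = bin2hex( random_bytes( 16 ) );  // 32 hex chars, like generateHex( 32 )
	$anonid = base_convert( substr( $token, 0, 8 ), 16, 10 );
	$counts[ $anonid % 2 === 0 ? 'source' : 'visual' ]++;
}
print_r( $counts );  // comes out very close to 500,000 / 500,000

The parity rule itself is unbiased, which is consistent with the suspicion below that the problem lies elsewhere.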
So, the question that naturally arises from this: is the imbalance present in both anonymous and logged-in users? If it's just one, there might be something in the assignment which is skewing things. If both, perhaps it's in the recording.
nshahquinn-wmf renamed this task from "Why are the numbers of contributors in each test bucket not more balanced?" to "VE mobile default test buckets are unbalanced". (Sep 9 2019, 8:57 AM)
MNeisler added a comment. (Sep 9 2019, 11:09 PM)
I discussed this task with @Neil_P._Quinn_WMF today. See the current proposed plan and thoughts below, and let me know if you have any other suggestions.
What might be causing contributors not being assigned to test buckets in a more balanced way?
As a first step, I'll investigate whether the imbalance is present for any particular dimension. I'll first look at anonymous vs. logged-in users and then into other potential dimensions (such as browser and country). Since the sample size is large, we should expect the buckets to be balanced across most dimensions. Isolating the imbalance to a specific dimension (if we are able to identify one) should help inform where the distribution error might be occurring and what the next steps are.
What are the implications of contributors not being assigned to test buckets in a more balanced way? (e.g. Are the results, and by extension the test, valid?)
This will depend on the issue that causes the imbalance and whether it resulted in a random or non-random assignment of users.
If the identified issue is that a random subset of users did not get placed in the correct bucket (for example, if the bucket wasn't properly stored for a random ~5% of people put into the visual default bucket), then the average results reported are still valid because the assignment is still random.
If the issue we identify is somehow causing a non-random assignment (for example, users with longer load times or connection issues are not stored properly), then the results of our test will be impacted.
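To illustrate the non-random case with a toy model (my example; all numbers are hypothetical): suppose both buckets have the same true completion rate, but abandoning users go unrecorded at different rates per bucket:

<?php
// Toy model: identical true behavior in both buckets (5% of users complete
// an edit), but abandoning users are lost before recording at different
// rates per bucket. All numbers here are hypothetical.
$trueCompletion = 0.05;
$lossAmongAbandoners = [ 'source' => 0.02, 'visual' => 0.10 ];
foreach ( $lossAmongAbandoners as $bucket => $loss ) {
	$users = 1000000;
	$completers = $users * $trueCompletion;        // editor loaded, so recorded
	$abandoners = $users - $completers;
	$recorded = $abandoners * ( 1 - $loss );       // some abandoners never logged
	$measured = $completers / ( $completers + $recorded );
	printf( "%s: measured completion rate %.2f%% (true: 5.00%%)\n",
		$bucket, $measured * 100 );
}
// The bucket losing more abandoners ('visual') looks better than it is.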
JTannerWMF edited projects: added VisualEditor (Current work); removed VisualEditor. (Sep 10 2019, 3:26 PM)
JTannerWMF moved this task from Incoming to Design and Product Analytics review on the VisualEditor (Current work) board.
kzimmerman subscribed and added a comment. (Sep 10 2019, 9:27 PM)
Quoting @MNeisler: "If the issue we identify is somehow causing a non-random assignment (for example, users with longer load times or connection issues are not stored properly), then the results of our test will be impacted."
Just reviewed this with Megan, and this is worth checking ahead of the analysis, but it shouldn't greatly impact timelines (e.g. it probably won't be necessary to run the whole test again).
If there's a skew in the assignment, that could be handled in the analysis itself (it may make the analysis slightly more tricky as a result, but it's still doable). If a particular subgroup ended up being heavily impacted (e.g. 100% were in only one condition), the subgroup should be excluded from the analysis and it might be worth doing a follow-up with that subgroup.
Megan is wrapping up something for Web this week, but will be able to start the most pressing Editing task starting on Friday (whether it's this one or another).
ppelberg added a comment. (Sep 11 2019, 2:05 PM)
Thank you for investigating this proactively, @DLynch, and @MNeisler and @kzimmerman for laying out the potential implications and our steps forward (summarized below).
A question re: other potential "dimensions": @MNeisler, besides browser and country, what other variables – if any – are knowable to us and could impact how a contributor is assigned to a test bucket? I ask this in the context of thinking through the other potential investigations we could do after first looking at logged-in vs. IP contributors. cc @DLynch
One thought that comes to mind: might there have been a specific time window during the test when contributors began being unevenly assigned? Thinking: maybe we released something else that unknowingly impacted how contributors get/got assigned.
Paths forward
Scenario: If we find a non-random subset of contributors did not get assigned to the correct test bucket...
Action: ...then we:
1. Re-run our initial check (T229426), excluding contributors along this "dimension" from our analysis
2. Conduct our full analysis (T221198), also excluding contributors along this "dimension"
3. Potentially do a follow-up test with contributors along this "dimension"
Scenario: If we find a random subset of contributors did not get assigned to the correct test bucket...
Action: ...then we:
1. Consider our initial check (T229426) to be valid, and then
2. Conduct our full analysis (T221198) as planned
matmarex subscribed. (Sep 11 2019, 5:11 PM)
ppelberg added a subscriber: Esanders. (Sep 11 2019, 7:44 PM)
Documenting a question @Esanders posed during today's planning meeting: what happens when a contributor tries to edit using a browser that VE doesn't support? How – if at all – are they assigned to a test bucket?
marcella subscribed. (Sep 11 2019, 7:45 PM)
DLynch added a comment. (Sep 11 2019, 7:51 PM)
@ppelberg They're assigned to the buckets server-side. They then log their bucket as the assigned one, regardless of which editor loads.
MNeisler moved this task from Triage to Next Up on the Product-Analytics board. (Sep 12 2019, 4:09 PM)
MNeisler moved this task from Next Up to Doing on the Product-Analytics board. (Sep 13 2019, 6:43 PM)
MNeisler added a comment. (Edited, Sep 17 2019, 8:47 PM)
I spent some time yesterday digging into the data to determine whether the imbalance was present for any particular subgroup. See a summary of the findings so far below; full results are linked here.
I was unable to isolate the imbalance to any of the following subgroups: user group (registered or anonymous), country, wiki, or event action. Across all these groups, there was a similar imbalance between the wikitext and visual editor buckets, with a higher share of users (around 51% to 54%) being placed in the wikitext bucket. See user bucket numbers for registered and anonymous users below:
Anonymous users:
bucket | users
default-source | 1,891,449
default-visual | 1,769,443
Registered users:
bucket | users
default-source | 9,137
default-visual | 8,819
There was also a similar imbalance across all of the major browsers; however, I did find a significant imbalance between the wikitext and visual editor user buckets for the Bingbot and BingPreview browsers (both web-crawling bots). For Bingbot, about 98% of the 384 users (both registered and anonymous) were placed in the wikitext-as-default bucket. This imbalance alone wouldn't account for the total imbalance we are seeing in overall users, but it might be a clue to where the issue is happening. Do we filter out bots from the test group?
Since we're seeing the same imbalance towards the wikitext buckets across these various dimensions and with init actions, I'm wondering if this might be a load-time issue.
Can we confirm that both buckets are assigned on the server side? Are they both assigned at the same time an init event is recorded on the server? If, for some reason, the wikitext bucket is assigned on the server side and the visual editor bucket on the client side, then that delay might be why we are seeing fewer people in the visual editor buckets.
Any other thoughts or ideas? Let me know if there are any other areas you'd like me to help investigate.
DLynch added a comment. (Sep 17 2019, 9:33 PM)
Buckets are all assigned server-side, yes -- I posted a link to where it's done above.
matmarex added a comment. (Sep 18 2019, 3:06 AM)
I looked into how we record the events again, and I think I noticed something we weren't thinking of before: events are only recorded after the editor loads. The code that records events is downloaded together with the editor code, so it can't record anything until the editor code is downloaded.
Events which are generated before the editor loads, like the 'init' event, are stored in a temporary queue, and get recorded only after the editor code is loaded.
If the user has a poor network connection and loading the editor fails (or is cancelled by the user), nothing is recorded. Because the visual editor code is much larger than the wikitext editor code (it depends on wiki config, but around 6x larger), the effect is much larger for it, and results in this imbalance.
I tested using Chrome dev tools set to simulate a "Slow 3G" connection (tested on …). For the wikitext editor, the 'init' event was recorded ~4 seconds after the user clicked the edit pencil. For the visual editor, it was recorded after ~13 seconds. If the user cancelled loading before that point, or lost their network connection, nothing would have been recorded.
(Screenshots: Wikitext / Visual)
If we wanted to fix this, it's technically an easy change to load this code earlier – but this would noticeably increase the code size of the initial page load. I'm not sure how significant the increase is, but it's probably not a good idea to worsen the users' experience just so that we can log some data about them…
I think that's the explanation for the missing users. (Or at least, probably the closest thing to an explanation we're going to get.)
In the data that we have logged now, we should probably treat the missing visual users as if they tried to use the editor but failed to load it. (Note that there would also be missing users for the wikitext editor, but we have no way to tell how many, other than guessing it's a lot fewer than for visual.)
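A back-of-the-envelope model (mine, not from the comment, with an assumed patience parameter) shows how the ~4 s vs. ~13 s load times alone could produce an imbalance of roughly this size among slow-connection users:

<?php
// Hypothetical model: users on slow connections abandon while waiting at a
// constant rate (exponential), with an assumed mean patience of 60 seconds.
// Load times are the ones measured above on simulated "Slow 3G".
$meanPatience = 60.0;  // seconds; an assumption, not a measured value
$loadTime = [ 'wikitext' => 4.0, 'visual' => 13.0 ];
$recorded = [];
foreach ( $loadTime as $editor => $t ) {
	$recorded[$editor] = exp( -$t / $meanPatience );  // P(still waiting at t)
	printf( "%s: %.1f%% of init events recorded\n", $editor, $recorded[$editor] * 100 );
}
// If equal numbers were assigned to each bucket, the recorded wikitext share
// would be ~53.7% -- in the ballpark of the observed skew.
printf( "recorded wikitext share: %.1f%%\n",
	100 * $recorded['wikitext'] / ( $recorded['wikitext'] + $recorded['visual'] ) );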
marcella awarded a token. (Sep 18 2019, 4:04 PM)
ppelberg added a comment. (Edited, Sep 19 2019, 2:40 AM)
Great work, @DLynch, @MNeisler and @matmarex.
So it sounds like we ended up in the situation where a non-random group of contributors is being affected by this issue. In T232237#5481099, we said we could correct for an issue like this in our analysis. Granted, at that point we didn't know what the issue was.
However, after talking with Megan about this today, I got the impression we now have less confidence in statistical methods to correct for this issue in our analysis, which leads me to wonder: @MNeisler, are you able to share how our results and our confidence in them might be affected should we move forward with analyzing the data we've gathered so far?
As for the issue itself [2], @matmarex, considering you already have a patch up and you, @DLynch and @Esanders seem to have worked out a path forward with it in chat [1]: is it safe for me to assume there isn't much – if any – engineering work left to fix the issue by recording events before the editor loads?
[1] Path forward: "There's not a great deal of harm in pushing it sooner, with async as a decent compromise."
[2] Issue: the event that records which test bucket a contributor is assigned to gets recorded after the editor loads. This means contributors are invisible to the test in instances where the editor fails to load.
ppelberg added a comment. (Sep 19 2019, 6:35 PM)
In T232237#5505084, @ppelberg wrote:
As for the issue itself [2], @matmarex, considering you already have a patch up and you, @DLynch and @Esanders seem to have worked out a path forward with it in chat [1]: is it safe for me to assume there isn't much – if any – engineering work left to fix the issue by recording events before the editor loads?
RE the engineering work to be done to resolve the issue: during today's standup, @DLynch and @Esanders came to the agreement that there is a minimal amount of engineering work left to resolve the issue of contributors' test bucket assignments not being recorded when the editor fails to load.
MNeisler added a comment. (Sep 24 2019, 4:25 PM)
In T232237#5505084, @ppelberg wrote:
@MNeisler, are you able to share how our results and our confidence in them might be affected should we move forward with analyzing the data we've gathered so far?
I discussed this with @Neil_P._Quinn_WMF today. There are a couple of potential ways we could correct for the identified issue in the analysis if we decided to proceed with analyzing the data as is:
1. We could add in all the missing visual users, treating them all as if they tried to use the editor but failed to load it. The problem with this approach is that we don't know whether they were all distinct users, or any other user details associated with these events. It also assumes that we are not missing any users for wikitext due to a load-timing issue.
2. We could subtract randomly selected init sessions from the wikitext bucket to bring the bucket numbers in line (sketched below). This might artificially increase the edit completion rate a little, but it would avoid us having to make assumptions about the missing user data.
Neither of these options is perfect (and both include assumptions we'd need to caveat), but I think they are feasible methods for correcting the issue, and they would not invalidate our overall results if we decided to proceed without the code fix due to timing concerns.
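A sketch of option 2 (my illustration; the session arrays are placeholders for the real per-session event data, with scaled-down counts):

<?php
// Sketch of correction option 2: randomly drop init sessions from the
// larger (wikitext) bucket until the bucket sizes match.
function downsample( array $sessions, int $target ): array {
	shuffle( $sessions );                   // uniform random selection
	return array_slice( $sessions, 0, $target );
}
$wikitextSessions = range( 1, 13022 );      // placeholder session IDs
$visualSessions   = range( 1, 12149 );      // (scaled-down bucket counts)
$balanced = downsample( $wikitextSessions, count( $visualSessions ) );
printf( "wikitext sessions after downsampling: %d (visual: %d)\n",
	count( $balanced ), count( $visualSessions ) );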
Mayakp.wiki subscribed. (Sep 24 2019, 11:15 PM)
gerritbot added a comment. (Sep 30 2019, 6:41 PM)
Change 539933 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/MobileFrontend@master] Load editor EventLogging code earlier
gerritbot added a project: Patch-For-Review. (Sep 30 2019, 6:41 PM)
matmarex added a comment. (Sep 30 2019, 6:44 PM)
In T232237#5501940, @matmarex wrote:
If we wanted to fix this, it's technically an easy change to load this code earlier – but this would noticeably increase the code size of the initial page load. I'm not sure how significant the increase is, but it's probably not a good idea to worsen the users' experience just so that we can log some data about them…
@Esanders suggested that we can load this code after the initial page load. This should not negatively impact the initial page loading performance, but it should allow us to record editor initialization in most cases. The new patch does that.
ppelberg added a comment. (Sep 30 2019, 10:21 PM)
In T232237#5535466, @matmarex wrote:
@Esanders suggested that we can load this code after the initial page load. This should not negatively impact the initial page loading performance, but it should allow us to record editor initialization in most cases. The new patch does that.
Thanks for following up on this, @matmarex. Regarding the assumption that this patch "...should not negatively impact the initial page loading performance...": what gives us confidence this is the case? The negligible impact this patch would have on the payload of the page? Some testing we've done? Something else?
matmarex added a comment. (Sep 30 2019, 11:34 PM)
Well honestly it's an educated guess, based on how I believe browsers work. I think folks on the Reading team keep tabs on the page load performance (sorry, I don't know how exactly, I just remember seeing tasks with pretty graphs once or twice when there was a regression), so if it turns out this is wrong, we will be able to notice and revert or reconsider.
ppelberg updated the task description and added a comment. (Edited, Sep 30 2019, 11:36 PM)
Updating the task description to add the bolded text to the "Done" section:
What might be causing contributors not being assigned to test buckets in a more balanced way?
Contributors' bucket assignments were being recorded after the editor finished loading. This means that in situations where contributors abandon their edits before the editor loads, no information about them (including their bucket assignment) is stored. See: T232237#5501940
What are the implications of contributors not being assigned to test buckets in a more balanced way? (e.g. Are the results, and by extension the test, valid?)
There is a percentage of contributors whose editing behavior we do not have access to. This makes further analysis more difficult, considering we'd need to exclude them from this analysis or make assumptions about their behavior. See: T232237#5520035
ppelberg added a comment. (Sep 30 2019, 11:47 PM)
In T232237#5536288, @matmarex wrote:
Well honestly it's an educated guess, based on how I believe browsers work. I think folks on the Reading team keep tabs on the page load performance (sorry, I don't know how exactly, I just remember seeing tasks with pretty graphs once or twice when there was a regression), so if it turns out this is wrong, we will be able to notice and revert or reconsider.
Got it – thank you for explaining that. And just to be doubly sure, would I be correct to think any potential negative impact we would see from this change would show up in how long it takes an article, in read mode, to load?
matmarex added a comment. (Sep 30 2019, 11:48 PM)
Yes.
ppelberg mentioned this in T234277: Decide how we move forward with the A/B test. (Sep 30 2019, 11:56 PM)
ppelberg added a subscriber: JTannerWMF. (Oct 2 2019, 1:11 AM)
@matmarex, do you know who is best suited to QA this patch? QA? Engineering? Product Analytics? cc @JTannerWMF @marcella
matmarex added a comment. (Oct 3 2019, 9:04 PM)
Engineering. (Note that it's still not merged; Krinkle had some concerns on the patch, and I just updated it.)
gerritbot added a comment. (Oct 4 2019, 7:55 PM)
Change 539933 merged by jenkins-bot:
[mediawiki/extensions/MobileFrontend@master] Load editor EventLogging code earlier
ReleaseTaggerBot added a project: MW-1.35-notes (1.35.0-wmf.1; 2019-10-08). (Oct 4 2019, 8:00 PM)
ppelberg moved this task from Design and Product Analytics review to Engineering QA on the VisualEditor (Current work) board. (Edited, Oct 7 2019, 10:13 PM)
In T232237#5545272, @matmarex wrote:
Engineering. (Note that it's still not merged; Krinkle had some concerns on the patch, and I just updated it.)
Noted – thanks for confirming, @matmarex. Considering this is now merged, I'm moving this to "Engineering QA".
Related to QA: as part of testing, are you able to see whether this approach to the patch impacts page load time? I saw Jon +2'd the patch, which leads me to think he might've checked this in his review, but that's a leap, so I want to be sure.
ppelberg mentioned this in T235101: Rerun mobile VE as default A/B test. (Oct 9 2019, 3:16 PM)
ppelberg mentioned this in T235104: QA for EventLogging patch. (Oct 9 2019, 3:36 PM)
ppelberg removed MNeisler as the assignee of this task. (Oct 10 2019, 10:01 PM)
ppelberg added a subscriber: MNeisler.
matmarex moved this task from Engineering QA to Product owner review on the VisualEditor (Current work) board. (Oct 15 2019, 5:59 PM)
Seems to work as expected: as far as I can tell, the 'init' event gets logged immediately, before the editor loads.
matmarex added a comment. (Oct 15 2019, 6:00 PM)
Oh, and I just saw that it also looks good on the Analytics side: T235104#5573854
matmarex added a subscriber: Jdlrobson. (Oct 15 2019, 6:03 PM)
In T232237#5553881, @ppelberg wrote:
Related to QA: as part of testing, are you able to see whether this approach to the patch impacts page load time? I saw Jon +2'd the patch, which leads me to think he might've checked this in his review, but that's a leap, so I want to be sure.
I can't really tell myself; it doesn't seem slower for me, but of course that doesn't mean much :)
@Jdlrobson, I vaguely recall you had some fancy graphs / tracking for the mobile site's performance. Can you help us find where that is, and confirm that the change did not have a negative impact?
ppelberg added a comment. (Edited, Oct 15 2019, 11:50 PM)
In T232237#5576474, @matmarex wrote:
Seems to work as expected: as far as I can tell, the 'init' event gets logged immediately, before the editor loads.
In T232237#5576479, @matmarex wrote:
Oh, and I just saw that it also looks good on the Analytics side: T235104#5573854
Thanks for checking, @matmarex. Let's await confirmation from Jon before we call this done.
Jdlrobson added a comment. (Oct 16 2019, 12:00 AM)
Check out … for all our boards. I think … should have all the information you need.
mmodell edited projects: added Product-Analytics (Kanban); removed Product-Analytics. (Oct 16 2019, 6:31 PM)
mmodell moved this task from Next 2 weeks to Doing on the Product-Analytics (Kanban) board. (Oct 16 2019, 6:32 PM)
matmarex added a comment. (Oct 16 2019, 9:09 PM)
Thanks!
I guess we're interested in the "Barack Obama on 3G (en.m)" graph (which has loading time data for the favorite test page, heh). You can zoom in on it by clicking the header and choosing "View" from the menu. It shows the last 7 days of data, but we're interested in the change since last week's deployment, so let's adjust to 14 days using the menu in the top-right of the page. Direct link to that view: …
It seems like a bunch of data between 2019-10-04 and 2019-10-11 is missing, which is unfortunate, because our change was deployed on 2019-10-10 (with the train). But I think we can still say that there's no visible change before and after.
ppelberg added a comment. (Oct 17 2019, 12:20 AM)
In T232237#5578104, @Jdlrobson wrote:
Check out … for all our boards. I think … should have all the information you need.
This is a big help – thank you, @Jdlrobson.
In T232237#5582154, @matmarex wrote:
I guess we're interested in the "Barack Obama on 3G (en.m)" graph (which has loading time data for the favorite test page, heh). You can zoom in on it by clicking the header and choosing "View" from the menu. It shows the last 7 days of data, but we're interested in the change since last week's deployment, so let's adjust to 14 days using the menu in the top-right of the page. Direct link to that view: …
Your explanation, combined with having the deployment date right there, makes thinking about this easier. Thank you for that, @matmarex.
In T232237#5582154, @matmarex wrote:
But I think we can still say that there's no visible change before and after.
As for the potential impact of the change [1]: it looks like the average render [stable] time was 4.14s in the 5 days between 29-Sep and 4-Oct (to your point, we don't have data between 2019-10-04 and 2019-10-11), and the average render [stable] time was 4.27s in the 5 days following the change being deployed (11-Oct – 15-Oct), which looks like a 3% increase.
Assuming render [stable] is the correct data point to look at to assess loading time [2], then agreed, a 3% increase seems negligible.
Bartosz, if you see any holes in my thinking above, please let me know. If not, let's call this QA complete.
[1] I'm assuming other code would have been deployed on 2019-10-10 that could've impacted loading performance.
[2] render [stable]: in my time looking, I wasn't able to locate where this, or any of the other attributes at the footer of the graphs, are defined. I looked here, here and here.
ppelberg added a comment. (Oct 18 2019, 8:58 PM)
In T232237#5582683, @ppelberg wrote:
Bartosz, if you see any holes in my thinking above, please let me know. If not, let's call this QA complete.
Bartosz and I just talked about the above in chat. We're going to call QA for this complete.
matmarex closed this task as Resolved. (Oct 20 2019, 4:53 PM)
matmarex claimed this task.
DannyS712 removed a project: Patch-For-Review. (Nov 18 2019, 11:35 AM)
DannyS712 subscribed and added a comment:
[batch] remove patch for review tag from resolved tasks
nshahquinn-wmf mentioned this in T223339: Re-run metrics from VE on mobile report. (Nov 26 2019, 11:57 PM)
Jdlrobson mentioned this in T226164: Invalid MobileWebSearch events being logged. (Jan 20 2020, 4:34 PM)