16:59:56 <hellais> #startmeeting OONI dev gathering 2016-04-18
16:59:56 <MeetBot> Meeting started Mon Apr 18 16:59:56 2016 UTC.  The chair is hellais. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:59:56 <MeetBot> Useful Commands: #action #agreed #help #info #idea #link #topic.
17:00:02 <hellais> greetings people!
17:00:08 <hellais> who is here?
17:00:19 * sbs is here
17:01:10 * willscott waves
17:01:30 <anadahz> hello
17:04:31 <hellais> excellent
17:04:36 <hellais> what have you been up to last week?
17:05:09 <willscott> I was offline last week. working on catching up on everything
17:08:02 <hellais> I mainly worked on the web_connectivity branch, that is starting to take a pretty decent form at this point. It's quite frustrating though how accessing the same site multiple times in a brief period of time will yield inconsistent results.
17:08:35 <willscott> what part is inconsistent?
17:09:59 <hellais> willscott: like when you try a TCP connect to the IP of the site on port 80: 3 times out of 4 you can establish it and the 4th time you get a connection refused
17:10:27 <hellais> or you will do the same exact request multiple times and on occasion you will get a different response
17:10:47 <hellais> and these behaviors are all due to server-side issues and not to tampering
17:11:50 <hellais> so I am making the blocking heuristics a bit more lax and considering, for example, TCP/IP blocking to happen only if both the separate TCP connect and the HTTP request (which includes a TCP connect) fail
17:14:22 <landers> here. got a WIP branch for adding an https endpoint to the oonibackend
17:14:23 <landers> eof
17:15:16 <hellais> nuke: are there any updates from iOS land?
17:15:29 <anadahz> Focused on: auditing, rebuilding and testing the OONI sysadmin ansible recipes and docker build images, integrating some tests to know when ooni-probe and ooni-backend fail to install, still working on a safe upgrade solution for the live (canonical) ooni-backend --EOF
17:15:49 <landers> also i think hellais wants me to get oonibackend working under some other twisted process manager?
17:15:50 <sbs> I have mered support for HTTPS minus certificate validation to MK. This is a necessary step to support a OONI HTTPS collector. I have also done general MK improvements and iterated over the NDT prototype. EOF
17:16:02 <sbs> s/mered/merged/
17:16:46 <nuke> iOS App 0.1 is practically done. Now working on Android interface to make it like the iOS version
17:17:10 <nuke> Btw, hi everyone :)
17:17:37 <anadahz> hellais: doing a TCP connect to the IP of the site on port 80 can introduce many false positives since most of the websites are running in shared hosting environments
17:18:37 <hellais> landers: regarding the process manager, you should look to see how twistd is invoked here: https://github.com/TheTorProject/ooni-probe/blob/feature/webui/ooni/webui.py
17:19:59 <hellais> anadahz: the purpose of doing the TCP connection is to verify if the blocking is just IP:port based or if there is knowledge of the HTTP protocol
17:20:26 <hellais> no data is actually sent, I just connect and tear down the connection immediately after
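A connect-and-teardown check of the kind described above could look roughly like the following. This is a minimal sketch using only the Python standard library, not the Twisted-based code in ooni-probe; the function name `tcp_connect_ok` and the 5 second default timeout are assumptions made for illustration.

```python
import socket


def tcp_connect_ok(ip, port=80, timeout=5.0):
    """Return True if a TCP handshake to (ip, port) completes.

    No application data is sent: the socket is closed right after
    the connection is established.
    """
    try:
        sock = socket.create_connection((ip, port), timeout=timeout)
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False
    sock.close()
    return True
```

A single False result says little on its own, since a refused connection can also come from a flaky server.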
17:21:44 <anadahz> hellais: so if the HTTP request returns a block page and a TCP connection to port 80 is successful on a given input what would be the result?
17:22:34 <hellais> anadahz: it depends if the dns response is consistent or not with the control
17:22:49 <hellais> if the dns responses are consistent then we would flag that as being blocked due to 'http'
17:23:05 <hellais> if the dns responses are inconsistent then we would flag it as being blocked due to 'dns'
17:23:31 <anadahz> hellais: and if any of these checks fails?
17:24:01 <hellais> anadahz: what do you mean if they fail?
17:24:09 <hellais> like that you can't get a control measurement?
17:24:22 <anadahz> hellais: if tcp check fails but dns and http succeeds what would be the outcome?
17:25:11 <anadahz> ^ from the probe side
17:25:23 <sbs> hellais: is there a reason why the tcp measurement is not taken opportunistically while connecting for performing the http test?
17:25:35 <hellais> anadahz: that blocking is happening due to tcp_ip based blocking
17:25:38 <hellais> https://github.com/TheTorProject/ooni-probe/blob/feature/web_connectivity/ooni/nettests/blocking/web_connectivity.py#L283
17:25:50 <hellais> ^^ here you can see the logic for determining blocking
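The decision rules discussed so far can be summarised in a small function. This is a hedged sketch reconstructed from the conversation only, not a copy of the linked web_connectivity.py; the function name, the parameter names and the `None` return for undetermined cases are assumptions. It follows the more lax rule stated earlier, where 'tcp_ip' is reported only when both the separate TCP connect and the HTTP request fail.

```python
def classify_blocking(dns_consistent, tcp_ok, http_ok, body_matches):
    """Return 'dns', 'http', 'tcp_ip', False (no blocking) or None (unknown).

    dns_consistent: probe DNS answers agree with the control
    tcp_ok:         the separate TCP connect succeeded
    http_ok:        the HTTP request got a response
    body_matches:   the response body looks like the control's
    """
    if http_ok:
        if body_matches:
            # Page fetched and consistent with control: not blocked,
            # even if the separate TCP connect happened to fail.
            return False
        # Unexpected page (e.g. a block page): blame DNS when the
        # answers differ from the control, otherwise blame HTTP.
        return 'http' if dns_consistent else 'dns'
    if not tcp_ok:
        # Both the TCP connect and the HTTP request failed.
        return 'tcp_ip'
    # HTTP failed but TCP worked: left undetermined in this sketch.
    return None
```

For the case raised above (block page returned, TCP connect to port 80 successful, DNS consistent), `classify_blocking(True, True, True, False)` yields 'http'.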
17:27:21 <anadahz> hellais: maybe I'm confused with your explanation at willscott about the inconsistent parts
17:27:52 <hellais> sbs: I was evaluating that possibility at the beginning, but for one it's hard to hook the call to socket.connect() inside of the twisted agent (it's buried very deeply inside of the code for the http Agent) and second I actually think it's more robust to do these checks twice, since I noticed that in some cases you will have the tcp connection failing, but in the end the http request actually works fine (and this is due to server-side issues)
17:28:25 <anadahz> hellais: I see quite some complexity there so I guess we should be a bit careful of how we interpret web_connectivity reports
17:29:35 <sbs> hellais: understood... from the point of view of not doing strange things to reduce fingerprintability, I think it would be better to perform one connection, but I understand the Twisted related issues...
17:30:14 <sbs> hellais: since in MK it's doable to do connect and tcp without going mad, I think we should do just one connection when we implement web connectivity for MK
17:31:02 <sbs> hellais:
17:31:51 <hellais> sbs: yes I agree that if it's doable to do it cleanly then it makes sense to re-use the connection. It's just that I don't want to add another layer of hacks on top of the existing hacks for the twisted Agent.
17:32:58 <sbs> hellais: yeah, I agree that with ooni it's better to avoid further complicating the agent
17:32:58 <hellais> sbs: I think though we should perhaps have some retry for anomalous measurements, like consider blocking to be happening only once we have tried connecting to the site in question 2-3 times
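The retry idea above could be sketched as follows. This is an illustration under stated assumptions rather than anything implemented in ooni-probe: `measure` is a hypothetical zero-argument callable that returns True when a single attempt looks anomalous, and the default of 3 attempts comes from the "2-3 times" mentioned in the discussion.

```python
def confirmed_anomaly(measure, attempts=3):
    """Report an anomaly only if every one of `attempts` tries is anomalous.

    `measure` is a hypothetical callable returning True for an
    anomalous attempt and False for a normal one.
    """
    for _ in range(attempts):
        if not measure():
            # One clean attempt is enough to treat the earlier
            # failures as transient server-side noise.
            return False
    return True
```

With this rule a site that refuses one connection in four would rarely be flagged, at the cost of extra traffic for inputs that really are blocked.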
17:34:20 <hellais> anadahz: yes I think the interpretation of the derived 'blocking': 'XXX' key should be taken with care. I bet that as we start looking at real results though we will see if there are ways of improving the detection logic
17:36:06 <sbs> hellais: returning to what you said earlier, is my understanding that you have seen a pattern where tcp is more likely to fail than http correct?
17:36:58 <hellais> sbs: in my very limited testing yes
17:37:26 <hellais> I have noticed inconsistencies with http response pages as well, but that is not very prevalent
17:37:34 <hellais> and it fits within 2 categories
17:38:00 <hellais> 1) The site is running some bad code that will every X requests return 5xx status codes
17:38:40 <hellais> 2) The site is behind a mis-configured load balancer and you sometimes get a certain page and other times you get another (I saw this happening quite a bit with domains that are parked on GoDaddy)
17:38:50 <sbs> hellais: this is very interesting! off the top of my head I cannot immediately think of a reason why something like this could happen
17:45:38 <anadahz> relevant to our discussion a real case: https://paste.debian.net/439292/
17:46:51 <sbs> anadahz: a real case of tcp vs http failure?
17:49:00 <anadahz> sbs: http "failure"
17:52:21 <anadahz> sbs: and the TCP connect test: https://paste.debian.net/439294/
17:53:41 <anadahz> there is no DNS hijacking happening
17:54:47 <anadahz> hellais: I would like to have this input URL running in web_connectivity test to do a comparison
17:55:44 <sbs> anadahz: mmm, now /me is confused: does this tell that connect is successful and http is hijacked by a proxy?
17:56:00 <anadahz> so DNS: OK, TCP connection on port 80: ok, HTTP: 302 redirect
17:56:23 <hellais> anadahz: do you also have a dns consistency test result?
17:57:22 <sbs> anadahz: okay, but we can explain this with a transparent proxy, right? the other way round (tcp connect fails and http is ok) is more complex to explain
17:58:10 <MightyOctopus> [ooni-backend] hellais pushed 1 new commit to feature/web_connectivity: https://git.io/vwYqK
17:58:10 <MightyOctopus> ooni-backend/feature/web_connectivity dc52a39 Arturo Filastò: Add monkey patch for bug in twisted RedirectAgent:...
17:58:47 <hellais> yeah I agree this is the "expected" behavior when a transparent proxy or some filtering technology that is looking at HTTP is present
17:59:04 <anadahz> hellais: sbs running a dns consistency test now
18:00:07 <anadahz> ^ https://paste.debian.net/439297/
18:01:05 <anadahz> tampering: false
18:01:51 <hellais> yeah it looks ok
18:02:06 <hellais> anyways I guess we should move into next steps since we are already overtime
18:02:23 <anadahz> so this would be a false positive for web connectivity ?
18:03:22 <hellais> I will continue work on the web_connectivity test, work on the measurement kit test: https://github.com/measurement-kit/measurement-kit/issues/403, adding SSL cert validation and possibly also look into the JNI hooks
18:03:34 <hellais> anadahz: no, this would return blocking: true
18:03:35 <hellais> err
18:03:40 <hellais> blocking: http
18:04:11 <hellais> anadahz: it would be this case here: https://github.com/TheTorProject/ooni-probe/blob/feature/web_connectivity/ooni/nettests/blocking/web_connectivity.py#L306
18:05:02 <anadahz> hellais: so all HTTP 302 redirected input will be mentioned as 'blocking: http' ?
18:06:26 <anadahz> like all HTTP URLs that redirect to HTTPS
18:09:04 <sbs> I will work on MK and specifically on NDT test and on adding support for retrieving IP address from Ubuntu and for performing geolocation using geoip
18:09:13 <sbs> EOF
18:09:34 <anadahz> next steps: continue work on OONI infrastructure sysadmin and test tasks -EOF
18:09:45 <hellais> anadahz: no, it follows redirects until the end, computes the body length and compares the body length from the client and backend
18:09:52 <hellais> s/backend/test_helper/
18:13:29 <hellais> rumor has it that using 0.7 as the factor for body length difference leads to a true positive ratio of 95% (http://www3.cs.stonybrook.edu/~phillipa/papers/JLFG14.pdf)
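The body length comparison can be illustrated with a short sketch. It assumes the factor is applied to the ratio of the shorter body to the longer one and that a ratio above the factor counts as a match; both the exact formula and the function name `body_length_match` are assumptions, not taken from the web_connectivity code or the paper.

```python
def body_length_match(client_len, control_len, factor=0.7):
    """Return True if the two body lengths are close enough to match.

    The ratio shorter/longer lies in [0, 1]; values above `factor`
    are treated as the same page (an assumed convention).
    """
    if client_len == 0 and control_len == 0:
        # Two empty bodies trivially match; also avoids dividing by zero.
        return True
    shorter = min(client_len, control_len)
    longer = max(client_len, control_len)
    return shorter / longer > factor
```

For example, a 1000 byte control page against an 800 byte client page gives a ratio of 0.8 and counts as a match, while a 100 byte block page gives 0.1 and does not.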
18:15:17 <anadahz> interesting ^
18:16:42 <hellais> if there are no more closing remarks
18:17:29 <hellais> I would say we adjourn, thanks for attending!
18:17:32 <hellais> #endmeeting