[tor-bugs] #20323 [Metrics/metrics-lib]: avoid httpurl connection and use more robust approach

Tor Bug Tracker & Wiki blackhole at torproject.org
Sat Oct 8 20:50:59 UTC 2016


#20323: avoid httpurl connection and use more robust approach
-------------------------------------+-------------------------------
     Reporter:  iwakeh               |      Owner:  karsten
         Type:  defect               |     Status:  new
     Priority:  Medium               |  Milestone:  metrics-lib 1.5.0
    Component:  Metrics/metrics-lib  |    Version:
     Severity:  Normal               |   Keywords:
Actual Points:                       |  Parent ID:
       Points:                       |   Reviewer:
      Sponsor:                       |
-------------------------------------+-------------------------------
 While testing sync, the following happened: while downloading `recent`
 from the main collector using the old method, i.e.
 DescriptorCollectorImpl, the download hung at some point.  I killed it
 after 30 minutes and restarted, and the collection then worked.

 When the hang happened again, I stopped the process immediately and
 restarted it using the new method, i.e. DescriptorIndexCollector.  This
 time, FileNotFound warnings were logged and the download finished without
 problems.  The files that were no longer available had just expired, that
 is, CollecTor had just cleaned them out of its recent folder, which would
 explain the hang of the old method, too.

 === Why can one approach cope with disappearing files?
 The two methods differ in the stack frames between
 BufferedInputStream.read and FilterInputStream.read (marked "Difference"
 in the thread dumps below).

 ==== thread dump old method
 {{{
 "CollecTor-Scheduled-Thread-1" #9 daemon prio=5 os_prio=0 tid=0x00007f49d43c0800 nid=0x25fd runnable [0x00007f49b0f06000]
    java.lang.Thread.State: RUNNABLE
         at java.net.SocketInputStream.socketRead0(Native Method)
         at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
         at java.net.SocketInputStream.read(SocketInputStream.java:170)
         at java.net.SocketInputStream.read(SocketInputStream.java:141)
         at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
         at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:593)
         at sun.security.ssl.InputRecord.read(InputRecord.java:532)
         at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
         - locked <0x000000071dc8b160> (a java.lang.Object)
         at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930)
         at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
         - locked <0x000000071dcbe658> (a sun.security.ssl.AppInputStream)
         at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
         at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
         at java.io.BufferedInputStream.read(BufferedInputStream.java:345)  <-- Difference starts
         - locked <0x000000071dd26f98> (a java.io.BufferedInputStream)
         at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:552)
         at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:609)
         at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:696)
         - locked <0x000000071dd2ae48> (a sun.net.www.http.ChunkedInputStream)
         at java.io.FilterInputStream.read(FilterInputStream.java:133)  <-- Difference ends
         at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3336)
         at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:238)
         at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
         at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117)
         at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
         at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
         at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
         - locked <0x000000071dd2d138> (a java.io.BufferedInputStream)
         at org.torproject.descriptor.impl.DescriptorCollectorImpl.fetchRemoteFile(DescriptorCollectorImpl.java:225)
         ...
 }}}

 ==== thread dump new method
 {{{
 "CollecTor-Scheduled-Thread-1" #9 daemon prio=5 os_prio=0 tid=0x00007f19fc3a9800 nid=0x4060 runnable [0x00007f19e4a3f000]
    java.lang.Thread.State: RUNNABLE
         at java.net.SocketInputStream.socketRead0(Native Method)
         at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
         at java.net.SocketInputStream.read(SocketInputStream.java:170)
         at java.net.SocketInputStream.read(SocketInputStream.java:141)
         at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
         at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:593)
         at sun.security.ssl.InputRecord.read(InputRecord.java:532)
         at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:973)
         - locked <0x0000000734394300> (a java.lang.Object)
         at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:930)
         at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
         - locked <0x00000007343963d0> (a sun.security.ssl.AppInputStream)
         at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
         at java.io.BufferedInputStream.read(BufferedInputStream.java:345)  <-- Difference starts
         - locked <0x000000071df8b3b8> (a java.io.BufferedInputStream)
         at sun.net.www.MeteredStream.read(MeteredStream.java:134)
         - locked <0x000000071df8f618> (a sun.net.www.http.KeepAliveStream)  <-- (*)
         at java.io.FilterInputStream.read(FilterInputStream.java:133)  <-- Difference ends
         at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3336)
         at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3329)
         at java.nio.file.Files.copy(Files.java:2908)
         at java.nio.file.Files.copy(Files.java:3027)
         at org.torproject.descriptor.index.DescriptorIndexCollector.fetchRemoteFiles(DescriptorIndexCollector.java:88)
         at org.torproject.descriptor.index.DescriptorIndexCollector.collectDescriptors(DescriptorIndexCollector.java:61)
         ...
 }}}

 (*) see [http://pag-www.gtisc.gatech.edu/chord/examples/jdk/sun/net/www/http/KeepAliveStream.java.html KeepAliveStream.java]


 Example code from DescriptorIndexCollector ([https://gitweb.torproject.org/metrics-lib.git/tree/src/main/java/org/torproject/descriptor/index/DescriptorIndexCollector.java?id=38b18e3520ac0adfc7ea4f15332a66fce8f21e5c#n86 cf. here]):
 {{{
 #!java
       try (InputStream is = new URL(baseUrl + "/" + filepathname)
           .openStream()) {
         Files.copy(is, tempDestinationFile.toPath());
 ...
 }}}
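
 To make the behaviour described above easier to try out, here is a minimal,
 self-contained sketch of the same `openStream()` plus `Files.copy` pattern
 that skips files that have disappeared.  The class `FetchSketch`, the method
 `fetchAll`, and the file names are hypothetical and not part of metrics-lib;
 `main` uses a `file:` base URL so that it runs without a network.  It relies
 on `URL.openStream()` throwing `FileNotFoundException` for a missing file,
 which is also what an HTTP 404 produces.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of metrics-lib. */
class FetchSketch {

  /**
   * Copies each named file below baseUrl into destDir and returns the
   * paths that were actually fetched.  Files that no longer exist are
   * skipped with a warning instead of aborting the whole run.
   */
  static List<Path> fetchAll(String baseUrl, List<String> names,
      Path destDir) throws IOException {
    List<Path> fetched = new ArrayList<>();
    for (String name : names) {
      Path target = destDir.resolve(name);
      try (InputStream is = new URL(baseUrl + "/" + name).openStream()) {
        Files.copy(is, target, StandardCopyOption.REPLACE_EXISTING);
        fetched.add(target);
      } catch (FileNotFoundException e) {
        // The file expired between listing and download: warn, continue.
        System.err.println("Skipping missing file: " + name);
      }
    }
    return fetched;
  }

  public static void main(String[] args) throws IOException {
    Path src = Files.createTempDirectory("src");
    Path dest = Files.createTempDirectory("dest");
    Files.write(src.resolve("a.txt"),
        "descriptor".getBytes(StandardCharsets.UTF_8));
    // Path.toUri() ends in "/" for directories; strip it to avoid "//".
    String base = src.toUri().toString().replaceAll("/$", "");
    List<Path> got = fetchAll(base, Arrays.asList("a.txt", "gone.txt"), dest);
    System.out.println("Fetched files: " + got.size());
  }
}
```

 Note that this sketch sets no timeouts, so it says nothing about whether
 the new method can also stall on a silent connection.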

 This is also a little shorter than the [https://gitweb.torproject.org/metrics-lib.git/tree/src/main/java/org/torproject/descriptor/impl/DescriptorCollectorImpl.java?id=38b18e3520ac0adfc7ea4f15332a66fce8f21e5c#n123 older approach]:

 {{{
     try {
       URL url = new URL(urlString);
       huc = (HttpURLConnection) url.openConnection();
       huc.setRequestMethod("GET");
       huc.connect();
       int responseCode = huc.getResponseCode();
       if (responseCode == 200) {
         BufferedReader br = new BufferedReader(new InputStreamReader(
             huc.getInputStream()));
         String line;
         while ((line = br.readLine()) != null) {
           sb.append(line).append("\n");
         }
         br.close();
       }
       ...
 }}}

 === next steps
 * Try to devise a test that triggers the above problem.
 * And/or analyze the underlying sources to find the reason.
 * Replace the HttpURLConnection code with the shorter Files.copy from an
   InputStream.
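
 As a possible starting point for the first step, the following sketch
 starts a local server that sends one chunk of a chunked response and then
 goes silent, and checks that a client read stalls.  This is an assumption
 about how to provoke a blocked read with the same shape as the old-method
 thread dump (ChunkedInputStream waiting on the socket); it has not been
 confirmed as the actual cause.  The class and method names are
 hypothetical.  A read timeout is set only so that the sketch terminates;
 with the default timeout of 0 the read would block indefinitely.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Hypothetical test helper for illustration; not part of metrics-lib. */
class StalledDownloadSketch {

  /** Starts a one-shot server that sends one chunk and then goes silent. */
  static ServerSocket startStallingServer() throws IOException {
    ServerSocket server = new ServerSocket(0); // any free local port
    Thread t = new Thread(() -> {
      try (Socket s = server.accept()) {
        BufferedReader in = new BufferedReader(new InputStreamReader(
            s.getInputStream(), StandardCharsets.US_ASCII));
        String line;
        while ((line = in.readLine()) != null && !line.isEmpty()) {
          // Consume the request headers up to the blank line.
        }
        OutputStream out = s.getOutputStream();
        out.write(("HTTP/1.1 200 OK\r\n"
            + "Transfer-Encoding: chunked\r\n\r\n"
            + "5\r\nhello\r\n").getBytes(StandardCharsets.US_ASCII));
        out.flush();
        Thread.sleep(5_000); // never send the terminating chunk
      } catch (IOException | InterruptedException e) {
        // Server socket closed or thread interrupted: nothing to do.
      }
    });
    t.setDaemon(true);
    t.start();
    return server;
  }

  /** Returns true if reading the response stalls past the read timeout. */
  static boolean readStalls(int port, int timeoutMillis) throws IOException {
    URL url = new URL("http://127.0.0.1:" + port + "/recent/file");
    HttpURLConnection huc =
        (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
    huc.setConnectTimeout(timeoutMillis);
    huc.setReadTimeout(timeoutMillis);
    try (InputStream is = huc.getInputStream()) {
      byte[] buf = new byte[1024];
      while (is.read(buf) != -1) {
        // Keep draining until end of stream or timeout.
      }
      return false; // the body ended normally
    } catch (SocketTimeoutException e) {
      return true; // no data arrived within timeoutMillis
    } finally {
      huc.disconnect();
    }
  }

  public static void main(String[] args) throws IOException {
    try (ServerSocket server = startStallingServer()) {
      System.out.println("Read stalled: "
          + readStalls(server.getLocalPort(), 500));
    }
  }
}
```

 The same server could be pointed at either collector implementation to
 compare how the two methods react to a silent connection.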

 This would probably be important for the following tickets, too:
 #8799, #16151

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/20323>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

