[tor-bugs] #25161 [Metrics/CollecTor]: Fix another memory problem with the webstats bulk import

Tor Bug Tracker & Wiki blackhole at torproject.org
Tue Feb 20 18:58:06 UTC 2018


#25161: Fix another memory problem with the webstats bulk import
-------------------------------+--------------------------
 Reporter:  karsten            |          Owner:  iwakeh
     Type:  defect             |         Status:  assigned
 Priority:  Medium             |      Milestone:
Component:  Metrics/CollecTor  |        Version:
 Severity:  Normal             |     Resolution:
 Keywords:                     |  Actual Points:
Parent ID:                     |         Points:
 Reviewer:                     |        Sponsor:
-------------------------------+--------------------------
Changes (by karsten):

 * owner:  karsten => iwakeh


Comment:

 So, even with 64G RAM I'm running into the very same issue:

 {{{
 2018-02-20 16:40:46,425 INFO o.t.c.w.SanitizeWeblogs:108 Processing logs for dist.torproject.org on archeotrichon.torproject.org.
 2018-02-20 16:54:39,815 ERROR o.t.c.c.CollecTorMain:71 The webstats module failed: null
 java.lang.OutOfMemoryError: null
         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
         at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
         at java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:598)
         at java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:677)
         at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:735)
         at java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160)
         at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174)
         at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
         at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
         at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:496)
         at org.torproject.collector.webstats.SanitizeWeblogs.findCleanWrite(SanitizeWeblogs.java:113)
         at org.torproject.collector.webstats.SanitizeWeblogs.startProcessing(SanitizeWeblogs.java:90)
         at org.torproject.collector.cron.CollecTorMain.run(CollecTorMain.java:67)
         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
         at java.lang.Thread.run(Thread.java:748)
 Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
         at java.util.Arrays.copyOf(Arrays.java:3236)
         at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:118)
         at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
         at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:135)
         at org.torproject.descriptor.internal.FileType.decompress(FileType.java:109)
         at org.torproject.collector.webstats.SanitizeWeblogs.lineStream(SanitizeWeblogs.java:190)
         at org.torproject.collector.webstats.SanitizeWeblogs.lambda$findCleanWrite$1(SanitizeWeblogs.java:111)
         at org.torproject.collector.webstats.SanitizeWeblogs$$Lambda$15/894365800.apply(Unknown Source)
         at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:267)
         at java.util.TreeMap$ValueSpliterator.forEachRemaining(TreeMap.java:2897)
         at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
         at java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:291)
         at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
         at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
         at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
         at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
         at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
 2018-02-20 16:54:39,917 INFO o.t.c.c.ShutdownHook:23 Shutdown in progress ...
 2018-02-20 16:54:39,917 INFO o.t.c.cron.Scheduler:127 Waiting at most 10 minutes for termination of running tasks ...
 2018-02-20 16:54:39,917 INFO o.t.c.cron.Scheduler:132 Shutdown of all scheduled tasks completed successfully.
 2018-02-20 16:54:39,918 INFO o.t.c.c.ShutdownHook:25 Shutdown finished. Exiting.
 }}}

 Judging from the `Caused by` part of the trace, the failure happens while
 `FileType.decompress` buffers an entire decompressed log file into a
 `ByteArrayOutputStream`, which eventually asks for an array larger than the
 VM limit. Can you try to optimize that code a little more?
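
 For illustration, one possible direction (sketched here with hypothetical
 names and plain `java.util.zip`, not CollecTor's actual `FileType` API)
 would be to stream lines directly off the decompressor instead of
 materializing the whole decompressed file in a byte array, so heap use
 stays bounded by roughly one line at a time:

 {{{
 import java.io.BufferedReader;
 import java.io.ByteArrayInputStream;
 import java.io.ByteArrayOutputStream;
 import java.io.IOException;
 import java.io.InputStreamReader;
 import java.nio.charset.StandardCharsets;
 import java.util.List;
 import java.util.stream.Collectors;
 import java.util.zip.GZIPInputStream;
 import java.util.zip.GZIPOutputStream;

 public class StreamingDecompress {

   // Hypothetical sketch: wrap the compressed input in a reader and
   // consume it line by line, rather than decompressing everything
   // into a ByteArrayOutputStream first (the step that overflowed).
   static List<String> readLines(byte[] gzipped) throws IOException {
     try (BufferedReader reader = new BufferedReader(
         new InputStreamReader(
             new GZIPInputStream(new ByteArrayInputStream(gzipped)),
             StandardCharsets.UTF_8))) {
       // The terminal collect() runs before the reader is closed.
       return reader.lines().collect(Collectors.toList());
     }
   }

   public static void main(String[] args) throws IOException {
     // Build a tiny gzipped sample in memory for demonstration only.
     ByteArrayOutputStream buf = new ByteArrayOutputStream();
     try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
       gz.write("line one\nline two\n".getBytes(StandardCharsets.UTF_8));
     }
     List<String> lines = readLines(buf.toByteArray());
     System.out.println(lines.size());
     System.out.println(lines.get(0));
   }
 }
 }}}

 In the real code the input would of course come from the log file on disk
 rather than a byte array; the point is only that the decompressed content
 never needs to exist as a single contiguous array.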

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25161#comment:11>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

