DRILL-5270: Improve loading of profiles listing in the WebUI#755
kkhatua wants to merge 4 commits into apache:master
A summary of the performance is available in this comment on the JIRA (DRILL-5270).
For 8266 profiles, when measured from Chrome browser's Network tool:
Force-pushed from fc15c30 to f7ad29b.
@sudheeshkatkam Can you please review the PR?
    @@ -0,0 +1,53 @@
    /**
Please use a regular comment for the header, not Javadoc.
    //Provides a threshold above which we report the time to load
    private static final long LISTTIME_THRESHOLD_MSEC = 2000L;

    private static final int DrillSysFileExtSize = DRILL_SYS_FILE_SUFFIX.length();
DrillSysFileExtSize -> drillSysFileExtSize
I wanted to treat this like a constant, but the name makes it look confusingly like a class name.
    private final AutoCloseableLock writeLock = new AutoCloseableLock(readWriteLock.writeLock());

    //Provides a threshold above which we report the time to load
    private static final long LISTTIME_THRESHOLD_MSEC = 2000L;
LISTTIME_THRESHOLD_MSEC -> LIST_TIME_THRESHOLD_MSEC
    try {
      currBasePathModified = fs.getFileStatus(basePath).getModificationTime();
    } catch (IOException ioexcp) {
      ioexcp.printStackTrace();
Please do not use printStackTrace().
Will publish a log message and return an empty iterator for now. I'm not sure how to bubble up an error to the UI; I'll look at how we do so for profile deserialization as a guide.
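The approach described in the reply above (log the failure, then fall back to an empty iterator) might look roughly like the following. This is a minimal sketch, not the PR's code: `ProfileLister` and `ProfileSource` are hypothetical names, and `java.util.logging` is used only to keep the example self-contained; the project's own logging framework would take its place.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the profile store; not Drill's actual class.
class ProfileLister {
  private static final Logger logger = Logger.getLogger(ProfileLister.class.getName());

  // Hypothetical source of profile names that may fail with an IOException.
  interface ProfileSource {
    List<String> list() throws IOException;
  }

  private final ProfileSource source;

  ProfileLister(ProfileSource source) {
    this.source = source;
  }

  // On failure, logs the exception (including its stack trace) through the
  // logger and returns an empty iterator instead of propagating the error.
  Iterator<String> getProfiles() {
    try {
      return source.list().iterator();
    } catch (IOException e) {
      logger.log(Level.WARNING, "Unable to list profiles", e);
      return Collections.emptyIterator();
    }
  }
}
```

Passing the exception to the logger keeps the stack trace in the configured log output, which `printStackTrace()` would instead write to standard error.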
      ioexcp.printStackTrace();
    }

    //Acquiring lock to avoid reloading for request coming in before completion of profile read
- Previously, acquiring the read lock was enough, but with your changes you modify class fields. Since many threads can access this method, you'll end up with race conditions; class fields can also be cached by threads. I think the design here should be reconsidered.
- The Guava library has several cache implementations. Can we leverage any of them instead of using a tree set?
Pinging @vlad, since he is working on DRILL-6053, which intends to change the same class to avoid excessive locking, so that he is aware of these changes.
I'll explain my design choices below. Is there a way I can prevent the threads from caching the fields?
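On the visibility question: in Java, writes made before a lock is released are visible to any thread that later acquires the same lock, so one common option is to read and write the shared fields only while holding a lock (declaring a field `volatile` is another). The sketch below illustrates this with a read/write lock and a re-check after acquiring the write lock. It is an assumption-laden illustration, not the PR's implementation: `GuardedProfileCache`, `modTimeSource`, and `loader` are hypothetical names standing in for the store, the directory modification time, and the expensive file system listing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical cache guard; class and field names are illustrative, not Drill's.
class GuardedProfileCache {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final LongSupplier modTimeSource;     // stands in for the directory's modification time
  private final Supplier<List<String>> loader;  // stands in for the expensive file system listing

  // Shared state: only read or written while holding the read or write lock.
  private long cachedModTime = Long.MIN_VALUE;
  private List<String> cachedNames = new ArrayList<>();
  private int reloadCount = 0;

  GuardedProfileCache(LongSupplier modTimeSource, Supplier<List<String>> loader) {
    this.modTimeSource = modTimeSource;
    this.loader = loader;
  }

  List<String> getProfiles() {
    long current = modTimeSource.getAsLong();

    // Fast path: nothing changed, so serve a copy of the cached list.
    lock.readLock().lock();
    try {
      if (current == cachedModTime) {
        return new ArrayList<>(cachedNames);
      }
    } finally {
      lock.readLock().unlock();
    }

    // Slow path: reload under the write lock.
    lock.writeLock().lock();
    try {
      // Re-check, because another thread may have reloaded while this one waited.
      if (current != cachedModTime) {
        cachedNames = new ArrayList<>(loader.get());
        cachedModTime = current;
        reloadCount++;
      }
      return new ArrayList<>(cachedNames);
    } finally {
      lock.writeLock().unlock();
    }
  }

  int getReloadCount() {
    lock.readLock().lock();
    try {
      return reloadCount;
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

With this shape, concurrent requests that arrive for the same directory state trigger at most one call to the loader; the others either take the fast path or find the cache already refreshed after waiting for the write lock.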
    /**
     * Add profile name to a TreeSet
     * @param profileName
Please do not leave @param or @return without a description. IDEs usually highlight them and ask for a description.
OK. Will fix this. Eclipse didn't flag it for me.
    /**
     * Filter for Drill System Files
     */
    public class DrillSysFilePathFilter implements PathFilter {
Please consider using FileSystemUtil, which helps to create filters. Passing a custom filter is also possible.
Ok. I was thinking of using
    List<FileStatus> fileStatuses = DrillFileSystemUtil.listFiles(fs, basePath, false, sysFileSuffixFilter);
@arina-ielchiieva I need to rebase this on top of the latest master, since it was originally based on code that is nearly a year old. When ready, I'll either create a new PR or push to this one. Let me know which you prefer.
The choice for a [...]
When Drill detects changes, I tried using the [...]
To evict, as I construct the TreeSet, I simply pop the oldest (by filename) entry. The Guava cache options don't seem to provide a way to define the basis on which to evict entries.
I believe @vrozov's work on DRILL-6053 is to address locking during writes specifically. The lock I used (and need) is for reads, to ensure that multiple requests don't trigger an expensive FileSystem call for the same state of the PStore.
If the tree exists and no change is detected, ThreadA will use the [...]
If the [...]
When ThreadB gets the read-lock, it discovers that during the wait, the [...]
We're using the [...]
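The eviction idea mentioned in this comment (pop the oldest entry, by filename, while the TreeSet is being built) can be sketched as a bounded sorted set. This is only an illustration: `RecentProfileSet` is a hypothetical class, and it assumes that file names sort so that the smallest name is the oldest profile, which may not match the actual naming scheme.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical bounded set of profile names; not the PR's actual class.
// Assumes names sort so that the smallest name is the oldest profile.
class RecentProfileSet {
  private final int capacity;
  private final TreeSet<String> names = new TreeSet<>();

  RecentProfileSet(int capacity) {
    this.capacity = capacity;
  }

  // Adds a name and, if the set grew past its capacity, evicts the smallest one.
  void add(String profileName) {
    names.add(profileName);
    if (names.size() > capacity) {
      names.pollFirst();
    }
  }

  // Returns the retained names from largest (assumed newest) to smallest.
  List<String> newestFirst() {
    return new ArrayList<>(names.descendingSet());
  }
}
```

Because eviction happens on every insert, the set never holds more than `capacity` names, regardless of how many files the directory contains.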
@arina-ielchiieva my GitHub id is [...]
Thanks, @vrozov. I'll make use of a separate lock for read-only purposes in case of [...]
Using Hadoop API to filter and reduce profile list load time, with a synchronization lock to avoid reloading from the DFS
Using an in-memory TreeSet-based cache, maintain the list of most recent profiles. Reload of the profiles is done when any of the following states changes:
1. Modification time of the profile dir
2. Number of profiles in the profile dir
3. Number of profiles requested exceeds the currently available list
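The three reload triggers listed in this commit message can be expressed as a single predicate. The sketch below is illustrative only; `ReloadCheck` and its parameter names are hypothetical and do not come from the PR.

```java
// Hypothetical helper; the class and parameter names are illustrative.
final class ReloadCheck {
  private ReloadCheck() {}

  // Returns true when any of the three triggers applies:
  // 1. the profile directory's modification time changed,
  // 2. the number of profiles in the directory changed,
  // 3. more profiles are requested than the cached list currently holds.
  static boolean needsReload(long cachedModTime, long currentModTime,
                             int cachedProfileCount, int currentProfileCount,
                             int requested, int available) {
    return currentModTime != cachedModTime
        || currentProfileCount != cachedProfileCount
        || requested > available;
  }
}
```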
1. Clocking total time to serialize the profiles.
2. Locking the profileSet until the transformed iterator is returned.
3. Converted transform function and trailing string to one-time init & constant respectively. (Let JVM optimize)
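The timing step, combined with the reporting threshold from the diff, could be sketched as follows. This is a hedged illustration: `TimedCall` and `timed` are hypothetical, the constant uses the name suggested in the review, and `System.err` stands in for whatever logger the project uses.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical timing helper; not part of the PR.
final class TimedCall {
  // Threshold above which the elapsed time is reported.
  static final long LIST_TIME_THRESHOLD_MSEC = 2000L;

  private TimedCall() {}

  static boolean exceedsThreshold(long elapsedMsec) {
    return elapsedMsec > LIST_TIME_THRESHOLD_MSEC;
  }

  // Runs the work, measures wall-clock time with a monotonic clock,
  // and reports only when the threshold is exceeded.
  static <T> T timed(String label, Supplier<T> work) {
    long start = System.nanoTime();
    T result = work.get();
    long elapsedMsec = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    if (exceedsThreshold(elapsedMsec)) {
      System.err.println(label + " took " + elapsedMsec + " ms");
    }
    return result;
  }
}
```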
Closing this PR in favor of #1250