in one, it'll be scanned and its contents cleaned up. The only thing is, as
of right now the folder itself won't be deleted; you'd have to run the
script from a higher directory.
I also fixed a bug where the script would crash instead of setting the P4Client.
I need to fix the script to use a `with` construct so that if you terminate the program, the P4Client is restored to what it was.
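Roughly what I have in mind for that, as a sketch (this assumes the client is selected through the P4CLIENT environment variable; the function name is just a placeholder, not what's in the script):

```python
import os
from contextlib import contextmanager

@contextmanager
def temporary_p4client(client_name):
    """Switch P4CLIENT for the duration of the block, restoring it even if
    the program is terminated mid-run.

    Assumes the script selects the client via the P4CLIENT environment
    variable; adapt if it uses `p4 set` or the P4Python API instead.
    """
    previous = os.environ.get("P4CLIENT")
    os.environ["P4CLIENT"] = client_name
    try:
        yield
    finally:
        # Restore whatever was there before, or remove the override entirely.
        if previous is None:
            os.environ.pop("P4CLIENT", None)
        else:
            os.environ["P4CLIENT"] = previous

# Usage sketch:
# with temporary_p4client("my_cleanup_client"):
#     run_cleanup()
```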
Removed the excess polling of p4. Fixed quiet output. Added directory
removal back in. Made the output a little nicer, added singular and
plural strings, and added a directory total to the output.
Also made sure the error output gets piped and doesn't show up in the
console. However, we shouldn't ignore any error output; it should be
accounted for and properly logged. So, this is a TODO.
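Something like this is what I mean by piping and logging the errors (the logger name and wrapper are placeholders, not what's in the script yet):

```python
import logging
import subprocess

log = logging.getLogger("p4_cleanup")  # placeholder logger name

def run_p4(args):
    """Run a p4 command with stderr captured so it never reaches the console.

    Instead of discarding errors, log them for later inspection (the TODO above).
    """
    proc = subprocess.run(
        ["p4"] + list(args),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    if proc.stderr:
        log.warning("p4 %s reported: %s", " ".join(args), proc.stderr.strip())
    return proc.stdout
```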
In case a specific directory was taking a while, I changed the console to auto
flush after a specified period of time. Right now autoflush is disabled by
default; you have to enable it when creating the console.
TODO: I'll probably hook the console up to stdout and stderr so you
can use ordinary print statements, we'll see. This is desirable for
easily hooking it into an existing module.
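The stdout hookup would look something like this sketch (it assumes the console exposes a write() method; adjust to the real console interface):

```python
import sys

class ConsoleRedirect:
    """File-like shim so print() routes through the threaded console.

    Assumes the console object exposes a write(text) method.
    """
    def __init__(self, console):
        self.console = console

    def write(self, text):
        if text.strip():
            self.console.write(text)

    def flush(self):
        pass  # batching/flushing is handled by the console itself

# Usage sketch:
# sys.stdout = ConsoleRedirect(console)
# sys.stderr = ConsoleRedirect(console)
```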
Also removed PressEnter. Added a global basename function so we can
override which version we're using; right now I'm seeing if
ntpath.basename works for all cases.
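The override is basically just a module-level alias, something like:

```python
import ntpath

# Single place to swap out which basename implementation the script uses.
# ntpath.basename accepts both '/' and '\\' separators, which is why it's
# being tried for all cases; swap in os.path.basename here if it turns out
# not to cover something.
basename = ntpath.basename
```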
Changed output a little bit.
Also just realized it actually should be easy to parse `p4 fstat ...`; I
just need to grab the clientFile output, and this script should be sped
up substantially. I need to figure out the best way to break this down:
I don't want it called on one huge directory, but rather have each
subdirectory split up the work. That said, that would miss the top-level
files. A good alternative to not waiting is to see if I can grab the
process output while it's working, instead of waiting for it to be done.
That would actually work perfectly; it's just tricky trying to figure out
if I can break it up. It would also still delay the start of the script.
I could do a mix of local and tree-based fstat: start with local and
switch to the tree.
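Grabbing the output while fstat is still running could look roughly like this (assuming the default tagged output, where each field comes out as a `... clientFile <path>` line; the function name is made up):

```python
import subprocess

def iter_client_files(path):
    """Yield clientFile paths from `p4 fstat` as they are produced.

    Reads the process output line by line instead of waiting for completion,
    so the script can start working before fstat finishes.
    """
    proc = subprocess.Popen(
        ["p4", "fstat", path + "/..."],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # errors dropped for brevity; see the logging TODO above
        text=True,
    )
    for line in proc.stdout:
        if line.startswith("... clientFile "):
            yield line[len("... clientFile "):].rstrip("\n")
    proc.wait()
```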
I haven't yet determined a good number of threads to use; we'll see.
I also have to change how the directories are being handled: it's kind of
a waste to push every directory to the queue, and it would be faster if
the batches were bigger.
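The batching I have in mind is roughly this (the batch size is a guess; it should be tuned once a good thread count is known):

```python
from queue import Queue

BATCH_SIZE = 32  # a guess, not a measured value

def enqueue_in_batches(directories, work_queue):
    """Push directories onto the work queue in batches instead of one by one,
    so each worker grabs a bigger chunk and there is less queue contention."""
    batch = []
    for directory in directories:
        batch.append(directory)
        if len(batch) >= BATCH_SIZE:
            work_queue.put(batch)
            batch = []
    if batch:
        work_queue.put(batch)

# Usage sketch:
# work_queue = Queue()
# enqueue_in_batches(directories, work_queue)
```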
I also still have to work on using fstat across a tree; this will bring
big speedups. The output is a bit different, parsing is more complex, and
how we handle things will be a bit different.
Made the threaded console batch messages so I can manually flush or clear
them. At some point I would consider a safety maximum buffer size that
triggers an auto flush. This worked out really well, though I have to see
why in some cases lines still appear to double up; it could be something
with the process not completing when I expect it to.
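The batching console is basically this shape (the class name and buffer cap are assumptions, not the exact code):

```python
import threading

class BatchedConsole:
    """Buffer messages from worker threads and print them in batches.

    Supports manual flush/clear, plus a safety cap that auto-flushes when
    the buffer grows too large (the cap value here is just a placeholder).
    """
    def __init__(self, max_buffer=200):
        self._lock = threading.Lock()
        self._buffer = []
        self._max_buffer = max_buffer

    def write(self, message):
        with self._lock:
            self._buffer.append(message)
            if len(self._buffer) >= self._max_buffer:
                self._flush_locked()

    def flush(self):
        with self._lock:
            self._flush_locked()

    def clear(self):
        with self._lock:
            self._buffer.clear()

    def _flush_locked(self):
        if self._buffer:
            print("\n".join(self._buffer))
            self._buffer.clear()
```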
This is possibly a naive thread implementation, since it pushes a
directory for every thread, which seems too drastic. I'd like to see how
much better it works without all the context switches. It's also a
matter of figuring out how much to handle yourself before letting
another thread join in. Right now the threads don't branch out too much,
since I think they basically do a breadth-first search, though I have to
double-check that.
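For reference, the breadth-first shape I think the workers have right now is roughly this (a sketch; handle_directory stands in for the real cleanup step, and how much work to do before re-queueing is the open question above):

```python
import os

def crawl_worker(work_queue, handle_directory):
    """Pull one directory at a time, handle it, then push its subdirectories
    back onto the queue -- which is what makes the crawl breadth-first."""
    while True:
        directory = work_queue.get()
        if directory is None:  # sentinel telling the thread to stop
            break
        handle_directory(directory)
        for entry in os.scandir(directory):
            if entry.is_dir():
                work_queue.put(entry.path)

# Usage sketch:
# threads = [Thread(target=crawl_worker, args=(work_queue, clean_directory))
#            for _ in range(thread_count)]
```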
Still to come: trying to safely work with fstat across multiple
directories. It's fast, but on the console the script would appear to
stall while it parses everything, so I'd still want to break it down
somewhat so you can see the script making visible progress. I would
also prefer this because then console messages wouldn't be so short and
blocky.
Improvements to come!
I was trying to use `p4 have` for speed, but it doesn't seem to work
with files that are added to a changelist but not yet submitted to the
depot. So I had to fall back to `p4 fstat`.
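For context, the difference is roughly this: `p4 have` only reports synced revisions, so a file that's only opened for add comes back empty, while `p4 fstat` still returns a record for it. A sketch of the check (the helper is illustrative, not from the script):

```python
import subprocess

def is_known_to_p4(local_path):
    """Return True if Perforce knows about the file.

    `p4 have` misses files opened for add but not yet submitted, so check
    with `p4 fstat` instead: it still reports a record for files that are
    only open in a changelist.
    """
    proc = subprocess.run(
        ["p4", "fstat", local_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    # fstat prints tagged output for known files and an error for unknown ones.
    return bool(proc.stdout.strip())
```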
So the script is much faster than before, though it actually still has a
lot of room for improvement; it will just be more complicated. Calling
`p4 fstat` on the entire directory gives you everything you need up
front; it's just that the results are depot paths, which makes things a
little annoying to parse when you have workspace mappings that move
things around, since the local path may differ from the depot path, and
it becomes harder to determine 100% that you're referring to the same
file. And I don't want to have to call p4 on every file to be sure of
that; what I'm doing now is the easiest, safest way to be sure, as far
as I know.
Another way to speed this up is to add thread crawlers; I'm just not yet
sure how many threads is a good idea to use with HDDs and SSDs.