Compare commits

...

28 Commits

Author SHA1 Message Date
Brian Ernst 3aa1373758 Minor cleanup. 2020-08-12 15:05:50 -07:00
Brian Ernst 3adaf471c1 Updated readme. 2020-08-12 14:55:20 -07:00
Brian Ernst 7f3c4b1cb8 Updated time.clock to time.time, since the former was deprecated. 2020-08-12 14:49:57 -07:00
leetNightshade e5a84235cb Fix issue where clientRoot is null due to multiple view mappings that
don't share one root. TODO: should probably leave getClientRoot to
return the "null"; it's different from returning None.
2017-04-20 16:20:43 -07:00
unknown 972e9ca689 Fixed comparison issue, apparently had to make sure the number was an int.
Stupid fucking error.
2015-06-09 10:15:00 -06:00
leetNightshade 55e5033794 Fixed error, forgot to comment out a line. 2015-06-08 21:20:28 -06:00
unknown d5dc8155f5 Fixed up path limitation issue with p4. 2015-06-08 14:50:07 -06:00
unknown a5f82d5e00 Neatened up output. 2015-05-13 12:12:36 -06:00
unknown 26d1127e64 Neatened up console output a little bit. 2015-05-13 12:10:58 -06:00
unknown 92d217371c Fixed scripts up, improved logging so the console has a waking thread now.
Also fixed a bug so that if the console timer runs too long it'll be killed
off appropriately.
2015-05-13 12:06:54 -06:00
unknown ea14f96d76 Fixed bug in p4SyncMissingFiles.py. Also fixed bug in p4Helper when
running p4RemoveUnversioned.py.
2015-05-13 10:45:04 -06:00
unknown c32c0bfbd1 Added bucketing based on file type (text/binary) and batching to reduce
server calls.
2015-05-12 14:47:18 -06:00
unknown 49153babed Fixed output bugs in p4SyncMissingFiles.py. 2015-02-18 15:28:58 -07:00
unknown 6610e8e357 Accidentally committed pyc. Will have to add .gitignore. 2015-02-18 15:14:18 -07:00
unknown 1d1d7f8cae Split scripts up for now, may add all-in-one scripts later. Added
p4SyncMissingFiles.py, so you don't have to do a force sync and redownload
everything.
2015-02-18 15:09:57 -07:00
unknown 9d4d26250d Added a fix: if the specified directory isn't added to the repo but is
still inside one, it'll be scanned and its contents cleaned up. The only
catch is that, as of right now, the folder itself won't be deleted; you'd
have to run the script from a higher directory.
2015-01-14 17:58:34 -07:00
unknown e7bb65874e Added more debug info at the end for files processed and cleaned up formatting. At some point I will strive to make the output more UNIX-friendly and parsable.
I also fixed a bug where the script would crash instead of setting the P4Client.

I need to fix the script to use a `with` construct so that if you terminate the program, the P4Client is restored to what it was (a sketch of this pattern follows the commit list).
2014-10-22 12:05:57 -06:00
U-ILLFONIC\bernst 06b0cbe426 Adding huge improvements. There are still a few more to make to account for computers not set up correctly, but it's functional. Still has the occasional console hang bug. Now also prints out run time. There is one new minor bug when reverting back to the previously set client view. 2014-08-13 17:09:19 -06:00
Brian 6236ead338 Adding new worker run type 2014-05-14 19:09:46 -06:00
Brian 3ffdd76147 Added basic worker thread back in, and TODO comments for multi-threading this new script. 2014-05-13 20:45:55 -06:00
Brian 59e010d682 Added a warning note for large depots. 2014-05-13 20:33:11 -06:00
Brian fd419089be See description. Why does this have to be so short?
Removed excess input of polling p4. Fixed quiet output. Added directory
removal back in. Made the output a little nicer, added singular and
plural strings, also added directory total output.
2014-05-13 20:18:16 -06:00
Brian 865eaa243d Removed creation of NUL file, annoying to get rid of. Also changed error formatting a little. 2014-05-13 19:08:53 -06:00
Brian 4435a36bed Made script obey quiet option. Added file and error count to print at end.
Also made sure the error output gets piped and doesn't show up in
console. However, we shouldn't ignore any error output; this should be
accounted for and properly logged. So, this is a TODO.
2014-05-13 14:08:16 -06:00
Brian 0dcd14a73b Working in Python 2.7.4 and Python 3.4.0, HOWEVER, Console isn't exiting correctly. 2014-05-09 19:11:23 -06:00
Brian 55a5e41b00 Improved the auto flushing, making it time- and buffer-size-based.
In case a specific directory was taking a while, I changed it to auto
flush after a specified period of time. Right now autoflush is
disabled by default; you have to enable it when creating the console.

TODO:  I'll probably hook the console up to the stdout and stderr so you
can use ordinary print statements, we'll see. This is desirable for
easily hooking it into an existing module.
2014-05-09 17:36:49 -06:00
Brian c175b21dcf Grabs the depot tree up front to make looping through the directory faster.
The big catch right now is that this method is single-threaded; I haven't
made it multi-threaded yet, but it definitely looks like it can benefit
from it.
2014-05-09 17:19:44 -06:00
Brian 8d425d6413 Catch exceptions in file iteration so you can continue processing remaining files.
The next changes will be ground-shaking; a lot should be changing, and
performance should increase significantly.
2014-05-09 15:21:56 -06:00
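Commit e7bb65874e above asks for a `with` construct that restores the previous P4Client on exit; the P4Workspace class added in this diff (see p4Helper.py below) is that fix. A minimal standalone sketch of the pattern, assuming `p4 set P4CLIENT` is the query/switch mechanism and the client name has no spaces:

import subprocess

class ScopedP4Client:
    # Sketch: switch P4CLIENT for the duration of a with-block and restore
    # the previous value on exit, even if the body raises. Restoring an
    # originally-unset client is not handled here.
    def __init__(self, new_client):
        self.new_client = new_client
        self.old_client = None
    def __enter__(self):
        # 'p4 set P4CLIENT' prints e.g. 'P4CLIENT=old_name (set)' when set
        out = subprocess.check_output('p4 set P4CLIENT'.split()).decode()
        if '=' in out:
            self.old_client = out.split('=', 1)[1].split()[0]
        subprocess.check_call(('p4 set P4CLIENT=' + self.new_client).split())
        return self
    def __exit__(self, exc_type, exc_value, tb):
        if self.old_client is not None:
            subprocess.check_call(('p4 set P4CLIENT=' + self.old_client).split())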
6 changed files with 851 additions and 276 deletions

1
.gitignore vendored Normal file

@@ -0,0 +1 @@
*.pyc

README.md

@@ -1,8 +1,17 @@
p4RemoveUnversioned
p4Tools
===================
Removes unversioned files from perforce repository. Script is in beta, though it works. It's a little slow due to the way fstat is used, but that will be changed soon, so the speed up should be enormous once that's done, up to 100x or more.
Perforce script tools for:
* Remove unversioned files
* Parallel sync missing files
* Parallel sync everything
* Etc.
This script does parse __.p4ignore__ ignore files, compiles the fields as regex, and scans every directory and file against the local and parent __.p4ignore__ files. This is my first time doing something like this, and I just realized this isn't actually correct; I need to update how things are ignored to follow the [spec](http://www.perforce.com/perforce/r12.1/manuals/cmdref/env.P4IGNORE.html), since it's not straight up regex.
The script is in beta; it works well but is still undergoing testing. A few stats are printed at the end, and more will be added (such as the number of files/directories checked) so you have an idea how much work was required. One reason this is still in testing: the end of the script sometimes gets stuck when closing Console logging. I haven't had the time to fix this, so it's not considered stable or production-ready for at least that reason.
Concerning benchmarks: I used to have an HDD and now have an SSD, so I can't provide valid comparisons to the old numbers until I rerun them on a computer with an HDD. That said, this single-worker implementation runs faster than the old multi-threaded version. I can't wait to update it further; it will only continue to get faster.
~~This script does parse __.p4ignore__ ignore files, compiles the fields as regex, and scans every directory and file against the local and parent __.p4ignore__ files. This is my first time doing something like this, and I just realized this isn't actually correct; I need to update how things are ignored to follow the [spec](http://www.perforce.com/perforce/r12.1/manuals/cmdref/env.P4IGNORE.html), since it's not straight up regex.~~ I need to re-add this to the newer script.
**Files are currently permanently deleted, so use this at your own risk.**
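A rough sketch of the regex-based .p4ignore handling described above (mirroring the Worker logic in p4RemoveUnversioned.py below, not the P4IGNORE spec behavior):

import os, re

def compile_p4ignore(directory):
    # Read <directory>/.p4ignore, strip '#' comments, and compile each
    # remaining field as a regex anchored at that directory.
    regexes = []
    with open(os.path.join(directory, '.p4ignore')) as f:
        for line in f:
            field = re.sub(r'#.*$', '', line).strip()
            if field:
                regexes.append(re.compile(re.escape(directory + os.sep) + field))
    return regexes

def is_ignored(path, regexes):
    return any(r.match(path) for r in regexes)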

444
p4Helper.py Normal file

@@ -0,0 +1,444 @@
#!/usr/bin/python
# -*- coding: utf8 -*-
# author : Brian Ernst
# python_version : 2.7.6 and 3.4.0
# =================================
import datetime, inspect, itertools, marshal, multiprocessing, optparse, os, re, stat, subprocess, sys, threading
# trying ntpath, need to test on linux
import ntpath
try: input = raw_input
except NameError: pass
#==============================================================
re_remove_comment = re.compile( "#.*$" )
def remove_comment( s ):
return re.sub( re_remove_comment, "", s )
def singular_plural( val, singular, plural ):
return singular if val == 1 else plural
#==============================================================
def enum(*sequential, **named):
enums = dict(zip(sequential, range(len(sequential))), **named)
return type('Enum', (), enums)
MSG = enum('SHUTDOWN', 'PARSE_DIRECTORY', 'RUN_FUNCTION')
p4_ignore = ".p4ignore"
main_pid = os.getpid( )
#==============================================================
#if os.name == 'nt' or sys.platform == 'cygwin'
def basename( path ):
# TODO: import based on platform
# https://docs.python.org/2/library/os.path.html
# posixpath for UNIX-style paths
# ntpath for Windows paths
# macpath for old-style MacOS paths
# os2emxpath for OS/2 EMX paths
#return os.path.basename( path )
return ntpath.basename( path )
def normpath( path ):
return ntpath.normpath( path )
def join( patha, pathb ):
return ntpath.join( patha, pathb )
def splitdrive( path ):
return ntpath.splitdrive( path )
def p4FriendlyPath(path):
"""
Returns the path with characters unsupported in Perforce file names escaped, per its filespec limitations.
"""
# http://www.perforce.com/perforce/doc.current/manuals/cmdref/filespecs.html#1041962
replace_items = {
'@' : '%40',
'#' : '%23',
'*' : '%2A',
'%' : '%25'
}
def replace(c):
return replace_items[c] if c in replace_items else c
return ''.join(map(replace, path))
#==============================================================
def get_ignore_list( path, files_to_ignore ):
# have to split path and test top directory
dirs = path.split( os.sep )
ignore_list = [ ]
for i, val in enumerate( dirs ):
path_to_find = os.sep.join( dirs[ : i + 1] )
if path_to_find in files_to_ignore:
ignore_list.extend( files_to_ignore[ path_to_find ] )
return ignore_list
def match_in_ignore_list( path, ignore_list ):
for r in ignore_list:
if re.match( r, path ):
return True
return False
#==============================================================
def call_process( args ):
return subprocess.call( args, stdout=subprocess.PIPE, stderr=subprocess.PIPE )
def try_call_process( args, path=None ):
try:
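# Note: a plain command string with shell=False works on Windows but not on
# POSIX; a portable variant would pass args.split() or a list of arguments.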
subprocess.check_output( args, shell=False, cwd=path )#, stderr=subprocess.STDOUT )
return 0
except subprocess.CalledProcessError:
return 1
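# Python 3 subprocess pipes yield bytes while Python 2 yields str; this flag
# drives the conversion helper below.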
use_bytearray_str_conversion = type( b"str" ) is not str
def get_str_from_process_stdout( line ):
if use_bytearray_str_conversion:
return ''.join( map( chr, line ) )
else:
return line
def parse_info_from_command( args, value, path = None ):
"""
:rtype : string
"""
proc = subprocess.Popen( args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=path )
for line in proc.stdout:
line = get_str_from_process_stdout( line )
if not line.startswith( value ):
continue
return line[ len( value ) : ].strip( )
return None
def get_p4_py_results( args, path = None ):
results = []
proc = subprocess.Popen( 'p4 -G ' + args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=path )
try:
while True:
output = marshal.load( proc.stdout )
results.append( output )
except EOFError:
pass
finally:
proc.stdout.close()
return results
#==============================================================
def fail_if_no_p4():
if call_process( 'p4 -V'.split( ) ) != 0:
print( 'Perforce Command-line Client(p4) is required for this script.' )
sys.exit( 1 )
# Keep these in mind if you have issues:
# https://stackoverflow.com/questions/16557908/getting-output-of-a-process-at-runtime
# https://stackoverflow.com/questions/4417546/constantly-print-subprocess-output-while-process-is-running
def get_client_set( path ):
files = set( [ ] )
make_drive_upper = os.name == 'nt' or sys.platform == 'cygwin'
command = "p4 fstat ..."
proc = subprocess.Popen( command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=path )
for line in proc.stdout:
line = get_str_from_process_stdout( line )
clientFile_tag = "... clientFile "
if not line.startswith( clientFile_tag ):
continue
local_path = normpath( line[ len( clientFile_tag ) : ].strip( ) )
if make_drive_upper:
drive, path = splitdrive( local_path )
local_path = ''.join( [ drive.upper( ), path ] )
files.add( local_path )
proc.wait( )
for line in proc.stderr:
line = get_str_from_process_stdout( line )
if "no such file" in line:
continue
raise Exception(line)
return files
def get_client_root( ):
"""
:rtype : string
"""
command = "p4 info"
proc = subprocess.Popen( command, stdout=subprocess.PIPE, stderr=subprocess.PIPE )
for line in proc.stdout:
line = get_str_from_process_stdout( line )
clientFile_tag = "Client root: "
if not line.startswith( clientFile_tag ):
continue
local_path = normpath( line[ len( clientFile_tag ) : ].strip( ) )
if local_path == "null":
local_path = None
return local_path
return None
class P4Workspace:
"""
Use this class when working in a workspace.
Makes sure the environment variables are set up correctly and that you aren't accidentally working in a non-Perforce directory;
otherwise you could delete files that shouldn't be deleted. Ex:
with P4Workspace( cwd ): #sets current workspace to cwd, or fails
# do stuff here
# on exit reverts to previous set workspace
"""
def __init__( self, directory):
self.directory = directory
def __enter__( self ):
# get user
#print("\nChecking p4 info...")
result = get_p4_py_results('info')
if len(result) == 0 or b'userName' not in result[0].keys():
print("Can't find perforce info, is it even setup? Possibly can't connect to server.")
sys.exit(1)
username = get_str_from_process_stdout(result[0][b'userName'])
client_host = get_str_from_process_stdout(result[0][b'clientHost'])
#print("|Done.")
# see if current directory is set to current workspace, if not, set it to current workspace.
client_root = get_client_root()
ldirectory = self.directory.lower()
oldworkspace_name = None
# If workspace root is null, it could be because there are multiple views and not a single root.
if client_root is None:
results = get_p4_py_results('where', self.directory)
for result in results:
path = get_str_from_process_stdout( result[b'path'] )
path = re.sub( r'\.\.\.$', '', path ) # strip the literal '...' suffix; unescaped dots would match any three trailing characters
path = normpath(path)
if ldirectory.startswith(path.lower()):
client_root = path
break
if client_root is None or not ldirectory.startswith(client_root.lower()):
#print("\nCurrent directory not in client view, checking other workspaces for user '" + username + "' ...")
oldworkspace_name = parse_info_from_command('p4 info', 'Client name: ')
# get user workspaces
results = get_p4_py_results('workspaces -u ' + username)
workspaces = []
for r in results:
whost = get_str_from_process_stdout(r[b'Host'])
if whost is not None and len(whost) != 0 and client_host != whost:
continue
workspace = {'root': get_str_from_process_stdout(r[b'Root']), 'name': get_str_from_process_stdout(r[b'client'])}
workspaces.append(workspace)
del results
# check current directory against current workspace, see if it matches existing workspaces.
for w in workspaces:
wname = w['name']
wlower = w['root'].lower()
if ldirectory.startswith(wlower):
# set current directory, don't forget to revert it back to the existing one
#print("|Setting client view to: " + wname)
if try_call_process( 'p4 set P4CLIENT=' + wname ):
#print("|There was a problem trying to set the p4 client view (workspace).")
sys.exit(1)
break
else:
# TODO: look up workspace/users for this computer
print( "Couldn't find a workspace root that matches the current directory for the current user." )
sys.exit(1)
#print("|Done.")
self.oldworkspace_name = oldworkspace_name
return self
def __exit__( self, type, value, tb ):
# If we changed the current workspace, switch it back.
if self.oldworkspace_name is not None:
#c.write("\nReverting back to original client view...")
# set workspace back to the original one
if try_call_process( 'p4 set P4CLIENT=' + self.oldworkspace_name ):
# error_count += 1 # have console log errors
# if not options.quiet:
print("There was a problem trying to restore the set p4 client view (workspace).")
sys.exit(1)
#else:
# if not options.quiet:
# c.write("|Reverted client view back to '" + self.oldworkspace_name + "'.")
#if not options.quiet:
# c.write("|Done.")
#==============================================================
class PTable( list ):
def __init__( self, *args ):
list.__init__( self, args )
self.mutex = multiprocessing.Semaphore( )
class PDict( dict ):
def __init__( self, *args ):
dict.__init__( self, args )
self.mutex = multiprocessing.Semaphore( )
#==============================================================
# TODO: Create a child thread for triggering autoflush events
# TODO: Hook console into stdout so it catches print
class Console( threading.Thread ):
MSG = enum('WRITE', 'FLUSH', 'SHUTDOWN', 'CLEAR' )
@staticmethod
def wake(thread):
thread.flush()
if not thread.shutting_down:
thread.wake_thread = threading.Timer(thread.auto_flush_time / 1000.0, Console.wake, [thread])
thread.wake_thread.daemon = True
thread.wake_thread.start()
# auto_flush_time is time in milliseconds since last flush to trigger a flush when writing
def __init__( self, auto_flush_num = None, auto_flush_time = None ):
threading.Thread.__init__( self )
self.buffers = {}
self.buffer_write_times = {}
self.running = True
self.queue = multiprocessing.JoinableQueue( )
self.auto_flush_num = auto_flush_num if auto_flush_num is not None else -1
self.auto_flush_time = auto_flush_time * 1000 if auto_flush_time is not None else -1
self.shutting_down = False
self.wake_thread = None
if self.auto_flush_time > 0:
Console.wake(self)
def write( self, data, pid = None ):
pid = pid if pid is not None else threading.current_thread().ident
self.queue.put( ( Console.MSG.WRITE, pid, data ) )
def writeflush( self, data, pid = None ):
pid = pid if pid is not None else threading.current_thread().ident
self.queue.put( ( Console.MSG.WRITE, pid, data ) )
self.queue.put( ( Console.MSG.FLUSH, pid ) )
def flush( self, pid = None ):
pid = pid if pid is not None else threading.current_thread().ident
self.queue.put( ( Console.MSG.FLUSH, pid ) )
def clear( self, pid = None ):
pid = pid if pid is not None else threading.current_thread().ident
self.queue.put( ( Console.MSG.CLEAR, pid ) )
def __enter__( self ):
self.start( )
return self
def __exit__( self, type, value, tb ):
self.shutting_down = True
if self.wake_thread:
self.wake_thread.cancel()
self.wake_thread.join()
self.queue.put( ( Console.MSG.SHUTDOWN, ) )
self.queue.join( )
def run( self ):
while True:
data = self.queue.get( )
event = data[0]
if event == Console.MSG.SHUTDOWN:
for ( pid, buffer ) in self.buffers.items( ):
for line in buffer:
print( line )
self.buffers.clear( )
self.buffer_write_times.clear( )
self.queue.task_done( )
break
elif event == Console.MSG.WRITE:
pid, s = data[ 1 : ]
if pid not in self.buffers:
self.buffers[ pid ] = []
if pid not in self.buffer_write_times:
self.buffer_write_times[ pid ] = datetime.datetime.now( )
self.buffers[ pid ].append( s )
try:
if self.auto_flush_num >= 0 and len( self.buffers[ pid ] ) >= self.auto_flush_num:
self.flush( pid )
elif self.auto_flush_time >= 0 and ( datetime.datetime.now( ) - self.buffer_write_times[ pid ] ).microseconds >= self.auto_flush_time:
self.flush( pid )
except TypeError:
print('"' + pid + '"')
raise
# TODO: if buffer is not empty and we don't auto flush on write, sleep until a time then auto flush according to auto_flush_time
elif event == Console.MSG.FLUSH:
pid = data[ 1 ]
if pid in self.buffers:
buffer = self.buffers[ pid ]
for line in buffer:
print( line )
self.buffers.pop( pid, None )
self.buffer_write_times[ pid ] = datetime.datetime.now( )
elif event == Console.MSG.CLEAR:
pid = data[ 1 ]
if pid in self.buffers:
self.buffers.pop( pid, None )
self.queue.task_done( )
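# Example usage (this is how the sync scripts below drive it):
#   with Console( auto_flush_time=1 ) as c:
#       c.write( "buffered, grouped per calling thread" )
#       c.writeflush( "buffered write immediately followed by a flush" )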
#==============================================================
# class Task( threading.Event ):
# def __init__( data, cmd = None ):
# threading.Event.__init__( self )
# self.cmd = cmd if cmd is not None else MSG.RUN_FUNCTION
# self.data = data
# def isDone( self ):
# return self.isSet()
# def join( self ):
# self.wait( )
#==============================================================
class Worker( threading.Thread ):
def __init__( self, console, queue, commands ):
threading.Thread.__init__( self )
self.console = console
self.queue = queue
self.commands = commands
def run( self ):
while True:
( cmd, data ) = self.queue.get( )
if not self.commands[cmd](data):
self.queue.task_done( )
break
self.queue.task_done( )

p4RemoveUnversioned.py

@@ -4,311 +4,153 @@
# python_version : 2.7.6 and 3.4.0
# =================================
# todo: switch to `p4 fstat ...`, and parse the output for clientFile and cache it.
# todo: have a backup feature, make sure files are moved to the recycle bin or a temporary file.
# todo: switch to a faster method of calling p4 fstat on an entire directory and parsing its output
# todo: add option of using send2trash
# todo: buffer output; after exceeding a certain amount, print to the output.
# todo: allow logging output besides console output, or redirection altogether
import inspect, multiprocessing, optparse, os, re, stat, subprocess, sys, threading, traceback
from p4Helper import *
# trying ntpath, need to test on linux
import ntpath
import time, traceback
re_remove_comment = re.compile( "#.*$" )
def remove_comment( s ):
return re.sub( re_remove_comment, "", s )
try: input = raw_input
except: pass
def enum(*sequential, **named):
enums = dict(zip(sequential, range(len(sequential))), **named)
return type('Enum', (), enums)
MSG = enum('SHUTDOWN', 'PARSE_DIRECTORY')
p4_ignore = ".p4ignore"
main_pid = os.getpid( )
def basename( path ):
#return os.path.basename( path )
return ntpath.basename( path )
def get_ignore_list( path, files_to_ignore ):
# have to split path and test top directory
dirs = path.split( os.sep )
ignore_list = [ ]
for i, val in enumerate( dirs ):
path_to_find = os.sep.join( dirs[ : i + 1] )
if path_to_find in files_to_ignore:
ignore_list.extend( files_to_ignore[ path_to_find ] )
return ignore_list
def match_in_ignore_list( path, ignore_list ):
for r in ignore_list:
if re.match( r, path ):
return True
return False
class PTable( list ):
def __init__( self, *args ):
list.__init__( self, args )
self.mutex = multiprocessing.Semaphore( )
class PDict( dict ):
def __init__( self, *args ):
dict.__init__( self, args )
self.mutex = multiprocessing.Semaphore( )
class Console( threading.Thread ):
MSG = enum('WRITE', 'FLUSH', 'SHUTDOWN', 'CLEAR' )
def __init__( self ):
threading.Thread.__init__( self )
self.buffers = {}
self.running = True
self.queue = multiprocessing.JoinableQueue( )
def write( self, data ):
self.queue.put( ( Console.MSG.WRITE, os.getpid(), data ) )
def flush( self ):
self.queue.put( ( Console.MSG.FLUSH, os.getpid() ) )
def clear( self ):
self.queue.put( ( Console.MSG.CLEAR, os.getpid() ) )
def __enter__( self ):
self.start( )
return self
def __exit__( self, type, value, tb ):
self.queue.put( ( Console.MSG.SHUTDOWN, ) )
self.queue.join( )
def run( self ):
while True:
data = self.queue.get( )
event = data[0]
if event == Console.MSG.SHUTDOWN:
# flush remaining buffers before shutting down
for ( pid, buffer ) in self.buffers.iteritems( ):
for line in buffer:
print( line )
self.buffers.clear( )
self.queue.task_done( )
break
elif event == Console.MSG.WRITE:
pid, s = data[ 1 : ]
if pid not in self.buffers:
self.buffers[ pid ] = []
self.buffers[ pid ].append( s )
elif event == Console.MSG.FLUSH:
pid = data[ 1 ]
if pid in self.buffers:
for line in self.buffers[ pid ]:
print( line )
self.buffers.pop( pid, None )
elif event == Console.MSG.CLEAR:
pid = data[ 1 ]
if pid in self.buffers:
self.buffers.pop( pid, None )
self.queue.task_done( )
class Worker( threading.Thread ):
def __init__( self, console, queue, files_to_ignore ):
threading.Thread.__init__( self )
self.console = console
self.queue = queue
self.files_to_ignore = files_to_ignore
def run( self ):
while True:
( cmd, data ) = self.queue.get( )
if cmd == MSG.SHUTDOWN:
self.queue.task_done( )
self.console.flush( )
break
if cmd != MSG.PARSE_DIRECTORY or data is None:
self.console.flush( )
self.queue.task_done( )
continue
directory = data
self.console.write( "Working on " + directory )
dir_contents = os.listdir( directory )
if p4_ignore in dir_contents:
file_regexes = []
# Should automatically ignore .p4ignore even if it's not specified, otherwise it'll be deleted.
path = os.path.join( directory, p4_ignore )
with open( path ) as f:
for line in f:
new_line = remove_comment( line.strip( ) )
if len( new_line ) > 0:
file_regexes.append( re.compile( os.path.join( re.escape( directory + os.sep ), new_line ) ) )
self.console.write( "| Appending ignores from " + path )
with self.files_to_ignore.mutex:
if directory not in self.files_to_ignore:
self.files_to_ignore[ directory ] = []
self.files_to_ignore[ directory ].extend( file_regexes )
ignore_list = get_ignore_list( directory, self.files_to_ignore )
files = []
command = "p4 fstat *"
try:
proc = subprocess.Popen( command.split( ), stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=directory )
(out, err) = proc.communicate()
except Exception as ex:
self.console.write( "| " + type( ex ) )
self.console.write( "| " + ex.args )
self.console.write( "| " + ex )
self.console.write( "|ERROR." )
self.console.flush( )
self.queue.task_done( )
continue
for line in err.decode('utf-8').split( os.linesep ):
if len( line ) == 0:
continue
# # dirty hack that grabs the filename from the ends of the printed out (not err) "depo_path - local_path"
# # I could use regex to verify the expected string, but that will just slow us down.
# base = basename( line )
i = line.rfind( ' - ')
if i >= 0:
base = line[ : i ]
if base == "*" or len(base) == 0:
# Directory is empty, we could delete it now
continue
path = os.path.join( directory, base )
if not os.path.isdir( path ):
files.append( base )
for content in dir_contents:
path = os.path.join( directory, content )
if os.path.isdir( path ):
if match_in_ignore_list( path, ignore_list ):
self.console.write( "| Ignoring " + content )
else:
self.queue.put( ( MSG.PARSE_DIRECTORY, path ) )
for file in files:
path = os.path.join( directory, file )
if match_in_ignore_list( path, ignore_list ):
self.console.write( "| Ignoring " + path )
continue
self.console.write( "| " + file + " is unversioned, removing it." )
os.chmod( path, stat.S_IWRITE )
os.remove( path )
self.console.write( "|Done." )
self.console.flush( )
self.queue.task_done( )
#==============================================================
def main( args ):
# check requirements
if os.system( 'p4 > Nul' ) != 0:
print( 'Perforce Command-line Client(p4) is required for this script.' )
sys.exit( 1 )
start = time.time()
fail_if_no_p4()
#http://docs.python.org/library/optparse.html
parser = optparse.OptionParser( )
parser.add_option( "-d", "--dir", dest="directory", help="Desired directory to crawl.", default=None )
parser.add_option( "-t", "--threads", dest="thread_count", help="Number of threads to crawl your drive and poll p4.", default=100 )
parser.add_option( "-q", "--quiet", action="store_false", dest="quiet", default=False )
parser.add_option( "-q", "--quiet", action="store_true", dest="quiet", help="This overrides verbose", default=False )
parser.add_option( "-v", "--verbose", action="store_true", dest="verbose", default=True )
parser.add_option( "-i", "--interactive", action="store_true", dest="interactive", default=False )
( options, args ) = parser.parse_args( )
( options, args ) = parser.parse_args( args )
root_full_path = os.getcwd( )
directory = normpath( options.directory if options.directory is not None else os.getcwd( ) )
# Files are added from .p4ignore
# Key is the file root, the value is the table of file regexes for that directory.
files_to_ignore = PDict()
with Console( auto_flush_time=1 ) as c:
with P4Workspace( directory ):
# Files are added from .p4ignore
# Key is the file root, the value is the table of file regexes for that directory.
files_to_ignore = PDict()
# make sure script doesn't delete itself
with files_to_ignore.mutex:
files_to_ignore[ root_full_path ] = [ re.compile( re.escape( os.path.join( root_full_path, basename( __file__ ) ) ) ) ]
processed_file_count = 0
processed_directory_count = 0
remove_file_count = 0
remove_dir_count = 0
warning_count = 0
error_count = 0
# Setup threading
threads = []
thread_count = int( options.thread_count )
thread_count = thread_count if thread_count > 0 else multiprocessing.cpu_count( ) + thread_count
if not options.quiet:
c.writeflush( "\nCaching files in depot, this may take a little while..." )
queue = multiprocessing.JoinableQueue( )
# TODO: push this off to a thread and walk the directory so we get a head start.
files_in_depot = get_client_set( directory )
with Console() as c:
for i in range( thread_count ):
t = Worker( c, queue, files_to_ignore )
threads.append( t )
t.start( )
if not options.quiet:
c.writeflush( "|Done." )
if len( threads ) == 1:
print( "Spawned %s thread." % len( threads ) )
else:
print( "Spawned %s threads." % len( threads ) )
# TODO: push an os.walk request off to a thread to build a list of files in the directory; create batches based on directory?
queue.put( ( MSG.PARSE_DIRECTORY, options.directory if options.directory is not None else os.getcwd( ) ) )
queue.join( )
# TODO: at this point join on both tasks to wait until they're done
for i in range( thread_count ):
queue.put( ( MSG.SHUTDOWN, None ) )
# TODO: kick off file removal, make batches from the files for threads to work on since testing has to be done for each.
# need to figure out the best way to do this since the ignore list needs to be properly built for each directory;
# will at least need to redo how the ignore lists are handled for efficiencies sake.
print( os.linesep + "Removing empty directories...")
# remove empty directories in reverse order
for root, dirs, files in os.walk( root_full_path, topdown=False ):
ignore_list = get_ignore_list( root, files_to_ignore )
if not options.quiet:
c.writeflush( "\nChecking " + directory)
for root, dirs, files in os.walk( directory ):
ignore_list = get_ignore_list( root, files_to_ignore )
for d in dirs:
path = os.path.join( root, d )
if not options.quiet:
c.write( "|Checking " + os.path.relpath( root, directory ) )
if match_in_ignore_list( path, ignore_list ):
# add option of using send2trash
print( "| ignoring " + d )
dirs.remove( d )
try:
os.rmdir(path)
print( "| " + d + " was removed." )
except OSError:
# Fails on non-empty directory
pass
print( "|Done." )
for d in dirs:
processed_directory_count += 1
path = join( root, d )
rel_path = os.path.relpath( path, directory )
for t in threads:
t.join( )
if match_in_ignore_list( path, ignore_list ):
# add option of using send2trash
if not options.quiet:
c.write( "| ignoring " + rel_path )
dirs.remove( d )
for f in files:
processed_file_count += 1
path = normpath( join( root, f ) )
if path not in files_in_depot:
if not options.quiet:
c.write( "| " + f + " is unversioned, removing it." )
try:
os.chmod( path, stat.S_IWRITE )
os.remove( path )
remove_file_count += 1
except OSError as ex:
c.writeflush( "| " + type( ex ).__name__ )
c.writeflush( "| " + repr( ex ) )
c.writeflush( "| ^ERROR^" )
error_count += 1
if not options.quiet:
c.write( "|Done." )
if not options.quiet:
c.write( os.linesep + "Removing empty directories...")
# remove empty directories in reverse order
for root, dirs, files in os.walk( directory, topdown=False ):
ignore_list = get_ignore_list( root, files_to_ignore )
for d in dirs:
processed_directory_count += 1
path = os.path.join( root, d )
rel_path = os.path.relpath( path, directory )
if match_in_ignore_list( path, ignore_list ):
# add option of using send2trash
if not options.quiet:
c.write( "| ignoring " + rel_path )
dirs.remove( d )
try:
os.rmdir(path)
remove_dir_count += 1
if not options.quiet:
c.write( "| " + rel_path + " was removed." )
except OSError:
# Fails on non-empty directory
pass
if not options.quiet:
c.write( "|Done." )
if not options.quiet:
output = "\nChecked " + str( processed_file_count ) + singular_pulural( processed_file_count, " file, ", " files, " )
output += str( processed_directory_count ) + singular_pulural( processed_directory_count, " directory", " directories")
output += "\nRemoved " + str( remove_file_count ) + singular_pulural( remove_file_count, " file, ", " files, " )
output += str( remove_dir_count ) + singular_pulural( remove_dir_count, " directory", " directories")
if warning_count > 0:
output += " w/ " + str( warning_count ) + singular_pulural( warning_count, " warning", " warnings" )
if error_count > 0:
output += " w/ " + str( error_count ) + singular_pulural( error_count, " error", " errors" )
end = time.time()
delta = end - start
output += "\nFinished in " + str(delta) + "s"
c.write( output )
if __name__ == "__main__":
try:
main( sys.argv )
except:
print( "Unexpected error!" )
print( "\nUnexpected error!" )
traceback.print_exc( file = sys.stdout )

60
p4Sync.py Normal file

@@ -0,0 +1,60 @@
#!/usr/bin/python
# -*- coding: utf8 -*-
# author : Brian Ernst
# python_version : 2.7.6 and 3.4.0
# =================================
from p4Helper import *
import multiprocessing, subprocess, time, traceback
#==============================================================
class P4Sync:
def run( self, args ):
start = time.time()
fail_if_no_p4()
#http://docs.python.org/library/optparse.html
parser = optparse.OptionParser( )
parser.add_option( "-d", "--dir", dest="directory", help="Desired directory to crawl.", default=None )
parser.add_option( "-t", "--threads", dest="thread_count", help="Number of threads to crawl your drive and poll p4.", default=0 )
parser.add_option( "-f", "--force", action="store_true", dest="force", help="Force sync files, even if you already have them.", default=False )
parser.add_option( "-q", "--quiet", action="store_true", dest="quiet", help="This overrides verbose", default=False )
parser.add_option( "-v", "--verbose", action="store_true", dest="verbose", default=False )
( options, args ) = parser.parse_args( args )
directory = normpath( options.directory if options.directory is not None else os.getcwd( ) )
thread_count = int( options.thread_count )
thread_count = thread_count if thread_count > 0 else multiprocessing.cpu_count( ) + thread_count
with Console( auto_flush_time=1 ) as c:
with P4Workspace( directory ):
if not options.quiet:
c.writeflush( "Syncing files..." )
try:
# in progress, very ugly right now.
cmd = "p4 " + \
( "-vnet.maxwait=60 " if thread_count > 1 else '' ) + \
"-r 100000 sync " + \
('-f ' if options.force else '') + \
("--parallel=threads=" + str(thread_count) + " " if thread_count > 1 else '') + \
os.path.join(directory, "...")
subprocess.check_output( cmd, shell=True )
except subprocess.CalledProcessError:
pass
if not options.quiet:
end = time.time()
delta = end - start
output = " Done. Finished in " + str(delta) + "s"
print( output )
if __name__ == "__main__":
try:
P4Sync().run(sys.argv)
except:
print( "\nUnexpected error!" )
traceback.print_exc( file = sys.stdout )
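For reference, a typical invocation of p4Sync.py (the directory is a hypothetical example; the flags are the script's own options):

python p4Sync.py -d C:\depot\project -t 8
python p4Sync.py -f    # force-sync the current working directory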

219
p4SyncMissingFiles.py Normal file

@@ -0,0 +1,219 @@
#!/usr/bin/python
# -*- coding: utf8 -*-
# author : Brian Ernst
# python_version : 2.7.6 and 3.4.0
# =================================
# TODO: setup batches before pushing to threads and use p4 --parallel
# http://www.perforce.com/perforce/r14.2/manuals/cmdref/p4_sync.html
from p4Helper import *
import time, traceback
#==============================================================
class P4SyncMissing:
def run( self, args ):
start = time.time()
fail_if_no_p4()
#http://docs.python.org/library/optparse.html
parser = optparse.OptionParser( )
parser.add_option( "-d", "--dir", dest="directory", help="Desired directory to crawl.", default=None )
parser.add_option( "-t", "--threads", dest="thread_count", help="Number of threads to crawl your drive and poll p4.", default=12 )
parser.add_option( "-q", "--quiet", action="store_true", dest="quiet", help="This overrides verbose", default=False )
parser.add_option( "-v", "--verbose", action="store_true", dest="verbose", default=False )
( options, args ) = parser.parse_args( args )
directory = normpath( options.directory if options.directory is not None else os.getcwd( ) )
with Console( auto_flush_time=1 ) as c:
with P4Workspace( directory ):
if not options.quiet:
c.writeflush( "Retreiving missing files..." )
c.writeflush( " Setting up threads..." )
# Setup threading
WRK = enum( 'SHUTDOWN', 'SYNC' )
def shutdown( data ):
return False
def sync( files ):
files_len = len(files)
files_flat = ' '.join('"' + p4FriendlyPath( f ) + '"' for f in files)
if options.verbose:
if files_len > 1:
c.write( " Syncing batch of " + str(files_len) + " ...")
for f in files:
c.write( " " + os.path.relpath( f, directory ) )
else:
for f in files:
c.write( " Syncing " + os.path.relpath( f, directory ) + " ..." )
ret = -1
count = 0
while ret != 0 and count < 2:
ret = try_call_process( "p4 sync -f " + files_flat )
count += 1
if ret != 0 and not options.quiet:
c.write("Failed, trying again to sync " + files_flat)
if ret != 0:
if not options.quiet:
c.write("Failed to sync " + files_flat)
else:
if not options.quiet:
if files_len > 1:
c.write( " Synced batch of " + str(files_len) )
for f in files:
c.write( " Synced " + os.path.relpath( f, directory ) )
return True
commands = {
WRK.SHUTDOWN : shutdown,
WRK.SYNC : sync
}
threads = [ ]
thread_count = int( options.thread_count )
thread_count = thread_count if thread_count > 0 else multiprocessing.cpu_count( ) + thread_count
count = 0
total = 0
self.queue = multiprocessing.JoinableQueue( )
for i in range( thread_count ):
t = Worker( c, self.queue, commands )
t.daemon = True
threads.append( t )
t.start( )
if not options.quiet:
c.writeflush( " Done." )
make_drive_upper = os.name == 'nt' or sys.platform == 'cygwin'
command = "p4 fstat ..."
if not options.quiet:
c.writeflush( " Checking files in depot, this may take some time for large depots..." )
proc = subprocess.Popen( command.split( ), stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd=directory )
clientFile_tag = "... clientFile "
headAction_tag = "... headAction "
headType_tag = "... headType "
# http://www.perforce.com/perforce/r12.1/manuals/cmdref/fstat.html
accepted_actions = [ 'add', 'edit', 'branch', 'move/add', 'move\\add', 'integrate', 'import', 'archive' ] #currently not checked
rejected_actions = [ 'delete', 'move/delete', 'move\\delete', 'purge' ]
file_type_binary = 'binary+l'
file_type_text = 'text'
client_file = None
file_action = None
file_type = None
file_type_last = None
# todo: use fewer threads, increase bucket size and use p4 threading
class Bucket:
def __init__(self, limit):
self.queue = []
self.queue_size = 0
self.queue_limit = limit
def append(self,obj):
self.queue.append(obj)
self.queue_size += 1
def is_full(self):
return self.queue_size >= self.queue_limit
self.buckets = {}
self.buckets[file_type_text] = Bucket(10)
self.buckets[file_type_binary] = Bucket(2)
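# Note: text files are batched 10 at a time and binary files 2 at a time;
# batching reduces server calls (see commit c32c0bfbd1), and the smaller
# binary bucket presumably keeps each sync call's transfer time bounded.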
def push_queued(bucket):
if bucket.queue_size == 0:
return
if options.verbose:
for f in bucket.queue:
c.write( " Checking " + os.path.relpath( f, directory ) )
self.queue.put( ( WRK.SYNC, bucket.queue ) )
bucket.queue = []
bucket.queue_size = 0
for line in proc.stdout:
line = get_str_from_process_stdout( line )
#push work when finding out type
if client_file and file_action is not None and line.startswith( headType_tag ):
file_type = normpath( line[ len( headType_tag ) : ].strip( ) )
if file_type == file_type_text:
self.buckets[file_type_text].append(client_file)
else:
self.buckets[file_type_binary].append(client_file)
count += 1
#check sizes and push
for b in self.buckets.values():
if b.is_full():
push_queued(b)
elif client_file and line.startswith( headAction_tag ):
file_action = normpath( line[ len( headAction_tag ) : ].strip( ) )
if file_action in rejected_actions:
file_action = None
else:
total += 1
if os.path.exists( client_file ):
file_action = None
elif line.startswith( clientFile_tag ):
client_file = line[ len( clientFile_tag ) : ].strip( )
if make_drive_upper:
drive, path = splitdrive( client_file )
client_file = ''.join( [ drive.upper( ), path ] )
elif len(line.rstrip()) == 0:
client_file = None
for b in self.buckets.values():
push_queued(b)
proc.wait( )
for line in proc.stderr:
line = get_str_from_process_stdout( line )
if "no such file" in line:
continue
#raise Exception(line)
c.write( line ) # log as error
if not options.quiet:
c.writeflush( " Done. Checked " + str(total) + " file(s)." )
c.writeflush( " Queued " + str(count) + " file(s), now waiting for threads..." )
for i in range( thread_count ):
self.queue.put( ( WRK.SHUTDOWN, None ) )
for t in threads:
t.join( )
if not options.quiet:
print( " Done." )
if not options.quiet:
end = time.time()
delta = end - start
output = " Done. Finished in " + str(delta) + "s"
print( output )
if __name__ == "__main__":
try:
P4SyncMissing().run(sys.argv)
except:
print( "\nUnexpected error!" )
traceback.print_exc( file = sys.stdout )