Tuesday, July 14, 2020

Class Module clStrings

Download from VBForums
Download from ME

This class module is another in a series of library modules that are designed to work with Visual Basic 6 (VB6) and all versions of Visual Basic for Applications (VBA). The code runs equally well in any of these environments, which I will refer to collectively as “VB” rather than VB6, VBA or VB6/VBA. All routines, including those that make Windows API calls or read and write files, work in Unicode throughout.

This module has a large set of routines to work with “BigString” which is up to 1,000 times faster than VB’s string functions in some cases. In addition there is a complete set of text and binary file read/write functions that fill a large void in VB’s file handling capabilities, especially for text files.

Overview

This class module contains an enhanced version of the BigString routines (sometimes called StringBuilder) that greatly speed up VB’s string handling, especially when concatenating a large number of strings. This module also has a simple-to-use but comprehensive system for reading and writing text files and binary files (I have read files of up to 500 MB in one read). An obvious question is why combine these in one module? When we read a text file it is much more efficient to read from disk in one read into a large memory buffer and then make individual lines of text out of it. A BigString is very convenient for this. Also, when we write strings to a text file it is more efficient to do the physical write in one call, which means the entire set of strings needs to be in one large data buffer, very much like a BigString. Thus it makes sense to me to combine them. I had earlier versions of these as separate modules, but I was almost always using them together so I combined and optimized them.

The BigString set of routines enables you to greatly speed up string operations when there are many changes and concatenations and/or when string lengths exceed a few hundred characters. Normally, VB re-allocates the entire string any time an operation changes it. When you use this module, a large string is allocated once and the subsequent string operations occur within that large string. The difference can be speed increases of up to 1,000 times versus standard string handling using built-in VB functions. If you are dealing with 100 strings or fewer then it is likely a bit quicker to use VB’s built-in procedures, but the BigString concept excels when dealing with thousands of strings.
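The principle described above can be illustrated with a minimal sketch in plain VB. This is not the clStrings code: the names SketchReset, SketchAppend, SketchValue and the constant SKETCH_CHUNK are made up for illustration, and the sketch uses the built-in Mid$ statement rather than whatever the real module does internally. It only shows the idea of allocating a buffer once, growing it in whole chunks, and writing each new piece in place.

```vb
Option Explicit

' Illustrative only: hypothetical names, not part of clStrings or mUCCore.
Private Const SKETCH_CHUNK As Long = 32768   ' characters added per growth step (assumed value)

Private mBuf As String    ' pre-allocated buffer
Private mUsed As Long     ' number of characters actually in use

' Empty the buffer and start again.
Public Sub SketchReset()
    mBuf = vbNullString
    mUsed = 0
End Sub

' Append Text to the buffer, growing the buffer only when it is full.
Public Sub SketchAppend(ByRef Text As String)
    Dim n As Long
    Dim needed As Long

    n = Len(Text)
    If n = 0 Then Exit Sub

    needed = mUsed + n
    If needed > Len(mBuf) Then
        ' Grow in whole chunks so that reallocation is rare.
        mBuf = mBuf & Space$((((needed - Len(mBuf)) \ SKETCH_CHUNK) + 1) * SKETCH_CHUNK)
    End If

    ' The Mid$ statement overwrites characters in place inside the existing buffer.
    Mid$(mBuf, mUsed + 1, n) = Text
    mUsed = needed
End Sub

' Return only the part of the buffer that holds real data.
Public Function SketchValue() As String
    SketchValue = Left$(mBuf, mUsed)
End Function
```

Calling SketchAppend in a loop touches the allocator only once per chunk, whereas S = S & Text copies the whole string on every pass. That difference is what the speed-up figures above refer to.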

The file Read/Write functions are as fast as anything you can do in any language. As a VB programmer you are likely painfully aware that even though the language deals with Unicode strings, when text files are read or written, all of the Unicode gets converted to ANSI, causing all sorts of problems. This module totally eliminates that problem. Most of what we do is via Windows API calls (all Unicode) so there is no inherent slowdown due to using Visual Basic. We can read or write text files in UTF-8 (today’s text standard used almost exclusively now for web pages since it efficiently handles all Unicode characters), UTF-16 which by convention has a BOM, ANSI (Default, OEM, CurrentThread or whatever code page you want to use), and UTF-8 with a BOM even though this is discouraged now. You can specify the file type or the routine can auto-detect which one it is.

By the way, a BOM is a Byte Order Mark: 2 or 3 bytes of data sometimes used at the start of a UTF-8 file and always at the start of a UTF-16 file to more or less announce what format the text will be in. UTF-8 has become so widely used that the use of a BOM is discouraged. If you wish to read more about the different types of text files that exist in the Windows world, see any of the following links: UTF-8, UTF-16, ANSI Code Pages (mid-article for Windows code pages) and here for a Microsoft discussion on the various Windows code pages you can use (if you really need to) when you convert to/from ANSI/Unicode.
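As a rough illustration of what checking for a BOM involves, here is a small sketch in plain VB. SketchDetectBom is a hypothetical helper, not a routine from clStrings, and the post does not say how the module's own auto-detection works. The byte values are the standard ones: EF BB BF for UTF-8, FF FE for little-endian UTF-16 and FE FF for big-endian UTF-16. The sketch assumes the byte array has already been dimensioned and filled.

```vb
Option Explicit

' Illustrative only: hypothetical helper, not part of clStrings or mUCCore.
' Returns "UTF-8", "UTF-16LE", "UTF-16BE", or "" when no BOM is found.
' Assumes Data has been dimensioned (UBound fails on an unallocated array).
Public Function SketchDetectBom(ByRef Data() As Byte) As String
    Dim first As Long
    Dim count As Long

    first = LBound(Data)
    count = UBound(Data) - first + 1

    ' UTF-8 BOM is three bytes: EF BB BF
    If count >= 3 Then
        If Data(first) = &HEF And Data(first + 1) = &HBB And Data(first + 2) = &HBF Then
            SketchDetectBom = "UTF-8"
            Exit Function
        End If
    End If

    ' UTF-16 BOM is two bytes: FF FE (little-endian) or FE FF (big-endian)
    If count >= 2 Then
        If Data(first) = &HFF And Data(first + 1) = &HFE Then
            SketchDetectBom = "UTF-16LE"
        ElseIf Data(first) = &HFE And Data(first + 1) = &HFF Then
            SketchDetectBom = "UTF-16BE"
        End If
    End If
End Function
```

A file with no BOM returns an empty string here. Telling BOM-less UTF-8 apart from ANSI needs a look at the content itself, which this sketch does not attempt.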

I have recently incorporated the ability to read and write binary files. It doesn’t fit with the rest of the module being related to strings but it required very little code beyond that necessary for the text read/write routines so I incorporated the features in this module.

This class module contains many string handling functions to address some VB shortcomings and to extend their capabilities. This module works in total Unicode and works with VB6 and all 32 and 64-bit flavors of VBA. All of the calls to clStrings require regular module mUCCore in order to function. This module contains many string functions on its own that you likely will find useful in addition to a whole host of routines I use every day including file operations, error handling, the operating system and so forth. Below is a list of string-related functions included in both the class module clStrings and module mUCCore.

BigString – String operations are very slow in VB6/VBA because every little change requires the string(s) to be totally reallocated. For a few characters this isn’t bad but it gets very bad when dealing with long strings and files. There is a whole subsystem described later that works around all of this, providing an alternate system to append, insert, search, remove, etc. strings at very high speeds.

Delim – Get or set the delimiter string (initially set to vbCrLf). Must be 1 or 2 characters. Defaults to vbCrLf which is standard for Windows text files.
Append – Add a string onto the end of bigString. Optionally set the starting character in the string to append.
AppendWDelim – Add a string and the Delimiter (initially vbCrLf)
Insert – Insert a string into the big string. Tell it what character position to insert ahead of, specify a string & optional start character in that string and whether or not to put the delimiter sequence onto the end of the inserted string.
InsertWDelim – Same as Insert above but with the current Delimiter tacked on to the string to insert.
Length – Return the current length of the string being built (same as normal “Len()”)
Remove – Remove a specified # of characters from a place in the string.
Split – Like normal Split but operates on our bigString. The delimiter is whatever has been set with Delim (default vbCrLf). Specify start/end character positions, Limit sets the number of split strings found. Compare sets how text is searched.
Find – Find a sub-string in the big string (equivalent to normal string’s InStr). Specify the string to find, what character to start looking and the compare method.
Capacity – Return current max length of the string with the current “chunks” (it will auto-grow for more data)
ChunkSize – Get or set the Unicode character chunk size. The default value is 32,768 characters (65,536 bytes).
GetAString – Returns part or all of the big string. Specify the start and stop character (default to the whole string in the BigString).
bigString – Set the value of the big string starting to be built (to erase set it to "").
GrowWithGarbage – Lengthen our internal string by a specified # of characters. Useful for later dropping in data from an API call etc. (Advanced).
AppendPtrData – Quicker append using pointer to string and how many characters to append. (Advanced)
InsertPtrData – Insert a string using a pointer to the string (Advanced).
HeapMinimize – Shrink the allocated memory for bigString down to a minimum (can still grow after this).
GetToIntChars – Copy part or all of the big string to an integer array.


Below are string functions found in the standard module mUCCore.

SubstStr – Substitute environment variables, drive label convert to drive letters, current time and date into a string.

StringW – Unicode replacement for VB function String$ which only uses chars 1-255.

iPad – Left and right-justify 2 strings over a given width. Good especially for tabular output to the Immediate Window since in both VBA and VB6 it is monospaced (all characters are the same width). I use it a lot for debugging.

AllocString – Makes a string containing a certain number of characters. Faster than Space$ for strings longer than about 400 characters. Not of much use by itself but it does provide a nice buffer for return string buffers from Windows API calls.

For those of you who dive deeper into programming than normal VB6/VBA coding, the following are Public procedures that deal with strings and text using pointers (although as we all know, nobody using VB6/VBA knows anything about pointers…). These procedures are extremely useful especially when dealing with many Windows API calls. If you don’t use pointers and memory buffers then you can ignore the below procedures. They are used internally in many of my other procedures but almost all of them take in and return normal VB variables and do not require any knowledge of pointers.

Ptr2VBStr – Makes a string in VBA and copies the data in memory to that string. You can specify the number of characters or have it find the end of the string (marked by a null character).

Ptr2Str – Even faster than Ptr2VBStr using a different algorithm. The function determines the string length.

lstrlenW – Find the length of a string in memory (characters followed by the null character).

RTLMoveMemory – Not just a string function. Copy memory data from one location to another.



File Functions

ReadTextFile – Read a text file into string array or BigString. File encoding can be UTF-8, UTF-16 or ANSI.

WriteTextFile – Write a text file from a string array or BigString, encoded in UTF-8, UTF-16 or ANSI.

ReadBinaryFile – Read a binary file into a byte array.

WriteBinaryFile – Write binary (non-text) data from a byte buffer to a file.

SetFilePtr – Set then return the read/write file pointer in the open file. In 32-bit code the position is held in a Currency data type and in 64-bit code the position is held in a 64-bit LongPtr. Both use the Windows API function SetFilePointerEx.

CloseOpenHandle – Close the file handle for our read/write functions (if the file has been left open).
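The Currency/LongPtr remark under SetFilePtr can be made concrete with a sketch of how SetFilePointerEx is commonly declared for the different VB flavors. These declarations are my own illustration and the ones inside clStrings or mUCCore may differ. The two helper functions are hypothetical. The key point is that Currency is stored as a 64-bit integer scaled by 10,000, so in 32-bit code a raw byte offset N travels to the API as the Currency value N / 10000.

```vb
Option Explicit

' Illustrative declarations; the actual ones in clStrings/mUCCore may differ.
#If VBA7 Then
    #If Win64 Then
        ' 64-bit VBA: the 64-bit file position fits in a LongPtr.
        Private Declare PtrSafe Function SetFilePointerEx Lib "kernel32" ( _
            ByVal hFile As LongPtr, ByVal liDistanceToMove As LongPtr, _
            ByRef lpNewFilePointer As LongPtr, ByVal dwMoveMethod As Long) As Long
    #Else
        ' 32-bit VBA7: Currency supplies the 8 bytes the API expects.
        Private Declare PtrSafe Function SetFilePointerEx Lib "kernel32" ( _
            ByVal hFile As LongPtr, ByVal liDistanceToMove As Currency, _
            ByRef lpNewFilePointer As Currency, ByVal dwMoveMethod As Long) As Long
    #End If
#Else
    ' VB6 and older 32-bit VBA: no PtrSafe or LongPtr available.
    Private Declare Function SetFilePointerEx Lib "kernel32" ( _
        ByVal hFile As Long, ByVal liDistanceToMove As Currency, _
        ByRef lpNewFilePointer As Currency, ByVal dwMoveMethod As Long) As Long
#End If

' Hypothetical helper: convert a byte offset to the Currency value whose
' underlying 64-bit integer equals that offset.
Public Function SketchBytesToCurrency(ByVal ByteOffset As Double) As Currency
    SketchBytesToCurrency = ByteOffset / 10000#
End Function

' Hypothetical helper: convert a Currency value returned by the API back to bytes.
Public Function SketchCurrencyToBytes(ByVal RawValue As Currency) As Double
    SketchCurrencyToBytes = RawValue * 10000#
End Function
```

For example, a position of 5,000 bytes is passed in 32-bit code as the Currency value 0.5, and a returned value of 1 means byte 10,000.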


Setup and Use

The class module clStrings requires only that module mUCCore is included in the program. If you want to run the code in Excel or VB6 you need do nothing other than use the code. Just insert the class module clStrings and the standard module mUCCore into a new or existing VB6 or VBA project and you are ready to go.

If you are using this module for Office programs other than Excel, you must set an appropriate conditional compilation constant for your VBA project. There is no built-in way to distinguish between the Office programs at compile time, so we need to set our own conditional compilation constants, which the code then checks. If you plan to use this in some code for Word, go to Tools | VBAProject Properties (2nd one from bottom) and in the General tab sheet, enter the value "Word = 1" (without quotes; case doesn't matter) to set the conditional compilation constant Word to 1. Do the same in VBA projects you want to run in Access (Access = 1), PowerPoint (PowerPoint = 1) and Outlook (Outlook = 1). Excel and VB6 do not need a compilation constant: the code can distinguish between VB6 and all of the VBA versions, and it assumes you are using Excel unless told otherwise, because most people who use VBA are using it in Excel. We can also automatically distinguish between 32 and 64-bit VBA code so you don’t need to do anything special for that.

Host – Required Conditional Compilation Constant
Visual Basic 6 – N/A
MS Excel – N/A
MS Word – Word = 1
MS Access – Access = 1
MS PowerPoint – PowerPoint = 1
MS Outlook – Outlook = 1

The reason for the distinction in VBA hosts is that there are commands that exist in one host but not in the other. For example, in VBA our code is held within individual documents and often we want to know what document is holding/running our code. In Excel this is ThisWorkbook.Path but in Word it is ActiveDocument.Path, in PowerPoint it is ActivePresentation.Path, in Access it is CurrentProject.Path and in Outlook there is no equivalent. If I have a line of code that uses ThisWorkbook.Path it will compile and run fine in Excel but it won’t even compile in any of those other hosts. In this particular case, I created a variable called AppPath that holds this path and I have code blocked out for each of the possible hosts so that they don’t “see” the statements that don’t exist in their version of VBA. It was a bit of a pain to set that up originally but once set up it works very well.
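The blocking-out described above might look something like the following sketch. The function name SketchAppPath is made up for illustration and the code in mUCCore that fills AppPath may be organized differently. The constants Word, Access, PowerPoint and Outlook are the ones set in the project properties as described earlier. The sketch covers the VBA hosts only, since VB6 would need its own branch using App.Path.

```vb
Option Explicit

' Illustrative only: hypothetical function, not the mUCCore implementation.
' Each host compiles just the branch whose constant is set, so it never
' "sees" object names that do not exist in its own object model.
Public Function SketchAppPath() As String
#If Word Then
    SketchAppPath = ActiveDocument.Path
#ElseIf Access Then
    SketchAppPath = CurrentProject.Path
#ElseIf PowerPoint Then
    SketchAppPath = ActivePresentation.Path
#ElseIf Outlook Then
    ' Outlook has no document that holds the code, so there is no path.
    SketchAppPath = vbNullString
#Else
    ' No constant set: assume Excel.
    SketchAppPath = ThisWorkbook.Path
#End If
End Function
```

A constant that has not been defined evaluates as False in an #If test, which is why Excel needs no setting of its own.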


Use

As with any class module, you must declare an object variable for the class before you can use it.

Dim Strs As clStrings

It doesn’t need to be named “Strs”. Then somewhere in your code you put the line

Set Strs = New clStrings

Initialization code sets the size of the big string for the string builder functions to 32,768 characters, the default ChunkSize (it actually doesn’t use any memory until you assign a string to it), and it also calls UCCoreInit in the module mUCCore if it hasn’t already been called by another routine.


When you are finished using the class module set it to Nothing.

Set Strs = Nothing
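Putting those three steps together, a typical session might look like the sketch below. It assumes clStrings and mUCCore are already in the project. The member names come from the list earlier in this post, but the exact argument lists are my assumption (optional arguments are simply left out here), so check them against the downloaded source. SketchBuildLines is a made-up procedure name.

```vb
Option Explicit

' Illustrative only: requires clStrings and mUCCore in the project.
' Argument lists are assumed from the descriptions in this post.
Public Sub SketchBuildLines()
    Dim Strs As clStrings
    Dim i As Long

    Set Strs = New clStrings

    ' Append each line followed by the current delimiter (vbCrLf by default).
    For i = 1 To 10000
        Strs.AppendWDelim "Line " & i
    Next i

    ' Length reports the characters currently held in the big string.
    Debug.Print Strs.Length

    ' With no start/stop given, GetAString is described as returning the whole string.
    Debug.Print Left$(Strs.GetAString, 20)

    Set Strs = Nothing
End Sub
```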

VB6 Users – There are controls that have been developed on the VBForums website by Krool which are enhanced versions of those Microsoft supplied with Visual Basic. Here is a link to his Common Controls Replacement Project. These controls enable Unicode and many other things. I highly recommend them. If you use them you must start your program with Sub Main and not a form and there is a bit of initialization code required to use a newer version of one of the Windows DLLs. That code is here in UCCoreInit so I recommend starting your programs with Sub Main and making the first line of code in that sub be a call to UCCoreInit. If you aren’t a VB6 user or none of this makes sense to you just skip it.



'========================================================================================
' Control Resizer Class Module
'
' This class module can be "attached" to a form and when the form is resized or maximized
' all of the controls on the form are resized and optionally all of the text associated
' with each control is resized as well.
'----------------------------------------------------------------------------------------
'Version history
' v1.0.0 28 Nov 2016 Initial release by Paul Grimmer (PJG) (email @ paul@grimmerfam.com)
' v1.1.0  5 Dec 2016
'    Incorporated Form_Resize into this class module using "WithEvents"
'    Ensured the first display is on an actual screen
'    This documentation was updated from earlier beta versions
' v2.0.0 11 Jul 2017
'    Frm is no longer declared as WithEvents so that we have control in the form code of resizing
'     i.e., the form's resize code is triggered and run and it does not automatically jump to the
'     code in this class module. Programmer has the choice to do that in the form resize
'     routine but it doesn't just bypass the form and come here automatically.
' v2.1.0 30 Aug 2017
'    Added SetFormPos to enable the form to be put on top (or not) at any time
' v2.1.001 23 Apr 2018 adding MDI child capabilities
'----------------------------------------------------------------------------------------
' Dependencies - None
'----------------------------------------------------------------------------------------
' To set up to use this class module with a VB6 Form, do the following to the Form:

' Set the following Form properties
'    BorderStyle = 2 (Resizable)
'    Optionally set MinButton to True and MaxButton to True

' In the declaration section at the top of the Form put
'Dim frmResize As New ControlResizer
'Public UniCaption As String

' In the Form_Load procedure, after your code for setting up the
' controls on your form, put the following line:
'frmResize.NewSetup Me

' That's all you need. An instance of this class is created when the Form is loaded. When
'  the Form is shown onscreen a Resize event is called and we catch the original state
'  of the form position.
' When the form is Unloaded this instance of the class module goes away too.
'
'----------------------------------------------------------------------------------------
' Controlling the resizing
'
'The form can be resized either by you in code or by the user who can drag the edges of the
' form to resize &/or maximize the form.
'
'There are four variables you can set to control the resizing behavior for each form (each
' is separate). After your form is loaded you can set these variables. The default for each
' is True.
'
'ResizeActive - True enables form resizing by the user.
'CanResizeFonts - True makes control font sizes change as control sizes change.
'KeepRatio - True keeps the height/width ratio the same as the starting form's
'  height/width ratio as the user resizes the form.
'Zoomable - True changes the size of each of the controls on the form as the
'  size of the form itself changes.
'
'You can set whether individual controls react to resizing by using the "Tag" property
' of the control ("Tag" is available for each control for you to put whatever text you want
' in the property at design-time.) Tag contents (if any) are totally up to the programmer.
' My code looks for the string "Skip" (not case sensitive) at the start of the tag for
'  each control. If that string is found then that control will not be resized or moved as
'  the rest of the form is moved and/or resized.
'========================================================================================

and

' Public Functions in this class module

' ReadTextFile - Read any text file into string array or BigString in UTF-8, ANSI or UTF-16 (UCS-2)
' WriteTextFile - Write any text file from string array or BigString to UTF-8, UTF-16 (UCS-2) or ANSI file
' ReadBinaryFile - Read part or all of a binary file into a byte array
' WriteBinaryFile - Write part or all of a byte array to a binary file
' SetFilePtr -  Sets the file pointer in the open file from ReadTextFile (if still open)
' CloseOpenHandle - If the Read/Write handle has been left open this closes it.

' BigString (based on old StringBuilder code)
'  Get Length - Return the current length of the string being built (chars)
'  Get Capacity - Return current max length of the string (auto grown
'    for more data) (chars)
'  Get/Let ChunkSize - Return or set the Unicode character chunk size (chars)
'  Get/Let Delim - Return or set the end-of-line delimiter sequence (1 or 2 characters)
'    Initially is set to vbCrLf, but could be vbCr or vbLf or any other 1 or 2 character string for end-of-line

'  GetAString - Returns part or all of the built string
'  Let bigString - Set the value of the string starting to be built (can be "")
'  Sub Append - Add a string onto the end of bigString
'  Sub AppendWDelim - Append and add a delimiter sequence to the end
'  Sub AppendPtrData - Quicker append using pointer to string
'  Sub GrowWithGarbage - Lengthen our internal string by a specified # of characters
'  Sub Insert - Insert a string into the big string
'  Sub InsertWDelim - Insert a string (with delimiter) into the big string
'  Sub InsertPtrData - Insert a string using a pointer to the string
'  Sub Remove - Remove a specified # of characters from a place in the string
'  Function Find - Find a sub-string in the big string (like VB InStr)
'  Sub Split - Split part or all of BigString into sub-strings (like VB Split)
'  Sub HeapMinimize - shrink the allocated memory for bigString down to a
'    minimum (can still grow after this)
'  Sub GetToIntChars - Copy part or all of the big string to an integer array
'----------------------------------------------------------
' v2.2.4 30 Jul 2018
' ======================================================================================
