My turn:

"""
HexDump program in Cobra.
Produces output like hexdump -C
"""

class HexDump

    def main
        args = CobraCore.commandLineArgs
        if args.count <> 2
            binary_name = Path.getFileName(args[0])
            print 'Usage: [binary_name] \[filename\]'
            return
        .processFileNamed(args[1], 16)

    def processFileNamed(filename as String, width as int)
        require
            width % 2 == 0 # width must be even
        test
            pass # to-do
        body
            halfWidth = width // 2
            hexPart = asciiPart = ''
            for n, byte in FileByteStream(filename).read.toList.numbered
                if (n % width) == 0 # file offset in hex
                    line = '[n:x8]'
                    hexPart += '[line] '
                hex = '[byte:x2]'
                hexPart += '[hex] ' # accumulate hex chars space separated
                asc = '.'
                if 0x1f < byte < 0xff
                    asc = Convert.toChar(byte).toString
                asciiPart += asc
                if (n+1) % width == 0 # display a full line
                    print '[hexPart] |[asciiPart]|'
                    hexPart = ''
                    asciiPart = ''
                else if (n+1) % halfWidth == 0 # extra space in middle of hex
                    hexPart += ' '
            n += 1
            if n % width
                # space pad both parts of partially filled last line
                hpad = 3*(width - (n % width)) + if(0 < (n % width) <= halfWidth, 1, 0)
                hexPart += ' '.padLeft(hpad)
                apad = width - (n % width)
                asciiPart += ' '.padLeft(apad)
                print '[hexPart] |[asciiPart]|'
            print '[n:x8]' # file byte count

class FileByteStream

    var _input as FileStream

    cue init(filename as String)
        base.init
        try
            _input = File.openRead(filename)
        catch ioe as IOException
            print 'I/O Error: [ioe.message]'

    def read as int*
        while true
            current = _input.readByte
            if current == -1, break
            yield current
        yield break

-- When I compile the original code, there is this warning:
hexdump1.cobra(6): warning: The value of variable "binary_name" is never used.
This ties back to hopscc's comment about using the "r" prefix on a string with substitution. I just wanted to point out that warnings are useful and worth heeding.
-- This code:
binary_name = args[0].split(c'/').toList.get(-1)
should be:
binary_name = Path.getFileName(args[0])
Now it's cross platform, easier to read and less error prone.
-- I agree with hopscc's comment about splitting main.
-- I don't agree with favoring "is shared" as hopscc does. I still see shared/static overused in real project code, leading to well-known problems: the inability to override methods in subclasses, increased difficulty in testing, and a subsequent bias toward shared/static variables, which are effectively global variables with their own well-known problems. I don't use shared/static unless I have a "damn good reason" to. Consider "shared" methods to be guilty until proven innocent.
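To make the overriding point concrete, here is a minimal sketch. The class names ByteFormatter and UpperByteFormatter and the method hexOf are hypothetical, made up for illustration, and are not part of the hexdump program. It assumes the usual Cobra behavior that instance methods can be overridden with "is override", which a shared method cannot be.

```cobra
class ByteFormatter

    def hexOf(byteValue as int) as String
        # lowercase two-digit hex, same format spec as the listing above
        return '[byteValue:x2]'

class UpperByteFormatter inherits ByteFormatter

    def hexOf(byteValue as int) as String is override
        # a subclass can swap in different behavior; a shared method could not be replaced this way
        return '[byteValue:X2]'

class Program

    def main
        f as ByteFormatter = UpperByteFormatter()
        assert f.hexOf(255) == 'FF'
        assert ByteFormatter().hexOf(255) == 'ff'
        print 'ok'
```

The same substitution is what makes testing easier: a test can pass in a stand-in subclass instead of the real object.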
I might post a longer article about this at some point with more examples and explanation.
-- Since the program only processes one argument, I changed the check "if args.count < 2" to "if args.count <> 2". Of course, another approach would be to dump all arguments passed.
-- I see a lot of "16" and "8" in the program. I factored these out. In the future, maybe the 16 could even be a command line argument.
-- I changed the name of "processFile" to "processFileNamed". When a method takes the name of something rather than an object of that something, then I prefer to put "Named" in the method. If I was passing a file object then I would call it "processFile". This is a pure style preference, of course.
-- I thought the entire "def numbered" method could be removed since we have it in the Cobra std lib. So then change this:
for n, byte in FileByteStream(filename).numbered
to this:
for n, byte in FileByteStream(filename).read.numbered
but it turns out that utility method is only on IList, not IEnumerable. So for now, I have a .toList in there:
for n, byte in FileByteStream(filename).read.toList.numbered
When the std lib supports .numbered on anything enumerable, the .toList can be removed.
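Until then, one alternative to .toList is to keep the counter by hand, which avoids copying the whole stream into a list first. This is only a sketch: sampleBytes is a hypothetical stand-in for FileByteStream(filename).read so that the example runs without a file.

```cobra
class Program

    def sampleBytes as int*
        # hypothetical stand-in for FileByteStream(filename).read
        yield 0x48
        yield 0x69
        yield 0x0a

    def main
        n = 0
        for byteValue in .sampleBytes
            print '[n:x8] [byteValue:x2]'
            n += 1 # manual counter instead of .numbered
        assert n == 3 # n already holds the byte count here
        print 'ok'
```

Note that with a manual counter, n is the byte count when the loop ends, so the "n += 1" that follows the loop in the listing would have to go.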
-- Instead of:
String.format('{0:x8}', n)
I prefer:
'[n:x8]'
-- The .read method repeats a statement. Although it's not a lot of code repetition, I prefer the version that does not, seen above.
-- I don't like "random casing":
# file Ofset in hex
# accumulate hex chars space separated
# Display a full line
# file byte Count
There is no rhyme or reason here.
-- FileByteStream has:
var _input
which is untyped. It works okay, but the code will be faster (and you will get more error checking) if it is typed:
var _input as FileStream
Another benefit is that reading the code 6 months later can be easier.
-- You'll notice that I stubbed out a potential test case, but left it blank for now. I will post separately on that.
-- BTW, no matter whose version I run, I get inconsistent widths in lines like so:
000000d0 00 80 00 00 00 02 00 00 00 00 00 00 03 00 00 00 |...............|
000000e0 00 00 10 00 00 10 00 00 00 00 10 00 00 10 00 00 |................|
Interestingly, when I pasted the output from Mac Terminal to the TextMate editor, a non-ASCII character was revealed. I guess Terminal won't print this character. I tried "hexdump -C" and it does not have this problem:
000000d0 00 80 00 00 00 02 00 00 00 00 00 00 03 00 00 00 |................|
000000e0 00 00 10 00 00 10 00 00 00 00 10 00 00 10 00 00 |................|
Nor does it ever display high bit characters. Also, "hexdump -C" will occasionally print an asterisk on a line by itself, but I haven't researched what that's about.
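A plausible cause of the short line, not confirmed against the actual file: the test "0x1f < byte < 0xff" lets bytes 0x7f through 0xfe through as characters, and the line at offset 000000d0 contains an 0x80 byte, which a terminal may decline to draw. A minimal sketch of a tighter test follows, assuming the goal is to show only printable 7-bit ASCII (0x20 through 0x7e) and a dot for everything else. The class name AsciiColumn and the method displayChar are hypothetical.

```cobra
class AsciiColumn

    def displayChar(byteValue as int) as String
        # assumption: only printable 7-bit ASCII is shown, everything else becomes '.'
        if 0x20 <= byteValue <= 0x7e
            return Convert.toChar(byteValue).toString
        return '.'

    def main
        assert .displayChar(0x41) == 'A'
        assert .displayChar(0x7f) == '.' # DEL is a control character
        assert .displayChar(0x80) == '.' # the byte on the short line above
        print 'ok'
```

In the listing, this would amount to changing the condition to "if 0x20 <= byte <= 0x7e".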
-- For a small utility, I might not take things this far, but this is illustrative of important principles for the larger programs that are harder to develop and maintain. Thanks for stimulating this discussion.