I've been at a loss for what to write lately, as none of my newer work is in a finished or otherwise presentable state; I believe I'll make this month one of rebuttals and of articles otherwise spurred by articles I've read elsewhere.

This article was spurred by Beating C With 80 Lines Of Haskell: Wc.

I'm still green to Ada, and implementing a trivial algorithm such as wc in any way is good practice, I think. The program follows; it uses a naive approach, and yet in my testing it has shown performance characteristics similar to those of the wc already present on the various POSIX systems I use. Note that I've neither tested this thoroughly nor truly attempted to optimize it:

-- Wc - Count the characters, lines, and words from a file in Ada.
-- Copyright (C) 2019 Prince Trippy programmer@verisimilitudes.net .

-- This program is free software: you can redistribute it and/or modify it under the terms of the
-- GNU Affero General Public License version 3 as published by the Free Software Foundation.

-- This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without
-- even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
-- See the GNU Affero General Public License for more details.

-- You should have received a copy of the GNU Affero General Public License along with this program.
-- If not, see <http://www.gnu.org/licenses/>.

with Ada.Text_IO, Ada.Characters.Handling, Ada.Command_Line, Ada.Sequential_IO;
use  Ada.Text_IO, Ada.Characters.Handling, Ada.Command_Line;

procedure Wc is
   package S is new Ada.Sequential_IO(Character);
   use S;

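   -- Read File to its end, counting every character, each line terminator, and each word.
   procedure Word_Count (File : in S.File_Type; Characters, Lines, Words : out Natural) is
      Current : Character;
      C, L, W : Natural := 0;
      Inside  : Boolean := False; -- True while the previous character was part of a word.
   begin
      while not End_Of_File(File) loop
         Read(File, Current);
         C := C + 1;
         if Is_Line_Terminator(Current) then
            -- A line terminator ends any word and counts as one line.
            Inside := False;
            L := L + 1;
         elsif Is_Space(Current) then
            -- A space ends any word.
            Inside := False;
         elsif not Inside and not Is_Control(Current) then
            -- A character which isn't a control character begins a word when outside of one.
            Inside := True;
            W := W + 1;
         end if;
      end loop;
      Characters := C; Lines := L; Words := W;
   end Word_Count;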

   File : S.File_Type;
   C, L, W : Natural;
begin
   Set_Output(Standard_Error);
   Set_Exit_Status(Failure);
   if Argument_Count /= 1 then
      Put_Line("You must specify one file.");
      return;
   end if;
   declare
      package I is new Ada.Text_IO.Integer_IO(Natural);
      use I;
      N : constant String := Argument(1);
   begin
      Open(File, Mode => In_File, Name => N);
      Word_Count(File, Characters => C, Lines => L, Words => W);
      Set_Output(Standard_Output);
      Put(L); Put(W); Put(C); New_Line;
      Flush;
      Close(File);
      Set_Exit_Status(Success);
   exception
      when   S.Data_Error => Put_Line("The file " & N & " contained invalid characters.");
      when S.Device_Error => Put_Line("A failure of the underlying device has occurred.");
      when    S.End_Error => Put_Line("The file " & N & " was read beyond its end.");
      when   S.Name_Error => Put_Line("The file " & N & " does not exist.");
      when S.Status_Error => Put_Line("The file " & N & " was somehow already open.");
      when    S.Use_Error => Put_Line("The file " & N & " could not be opened for reading.");
   end;
exception
   when others => Put_Line(Standard_Error, "An unanticipated error has occurred.");
end Wc;

Unlike the article which spurred this, I do make the claim that my language of choice is better than C. By my figuring, this program is not optimized in C, firstly because it doesn't even warrant being a separate program, and secondly because this problem isn't one solved by using an interface exposed easily to C through POSIX and kept from other languages by good taste.

This has been good Ada practice, I think, and I may rewrite this later to use a different package to perform I/O. Sequential_IO was used because it was the barest interface which comprehensively provided the desired functionality, as only sequential access is needed. Text_IO was shunned because its character collection routines get in the way. An obvious improvement would be a buffering approach of some sort, for improved performance.
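What follows is a minimal sketch of one such buffering approach, not a measured improvement. It swaps in Ada.Streams.Stream_IO for Sequential_IO and reads the file in large blocks, applying the same classification as the program above to each block. The name Wc_Buffered, the buffer size, and the assumption that a Stream_Element is one 8-bit character are all choices made for illustration; the error handling of the original is left out for brevity.

```ada
-- Sketch only: assumes an Ada 2012 compiler and that Stream_Element is an 8-bit byte,
-- which the language doesn't guarantee.
with Ada.Text_IO, Ada.Characters.Handling, Ada.Command_Line, Ada.Streams.Stream_IO;
use  Ada.Characters.Handling;

procedure Wc_Buffered is
   use Ada.Streams; -- Makes Stream_Element_Array and its index operators directly visible.
   package SIO renames Ada.Streams.Stream_IO;

   File    : SIO.File_Type;
   Buffer  : Stream_Element_Array(1 .. 65_536); -- The size is an arbitrary choice.
   Last    : Stream_Element_Offset;
   Current : Character;
   C, L, W : Natural := 0;
   Inside  : Boolean := False;
begin
   if Ada.Command_Line.Argument_Count /= 1 then
      Ada.Text_IO.Put_Line(Ada.Text_IO.Standard_Error, "You must specify one file.");
      Ada.Command_Line.Set_Exit_Status(Ada.Command_Line.Failure);
      return;
   end if;
   SIO.Open(File, Mode => SIO.In_File, Name => Ada.Command_Line.Argument(1));
   loop
      -- One Read fills as much of Buffer as the file allows; Last is the final index set.
      SIO.Read(File, Buffer, Last);
      exit when Last < Buffer'First; -- Nothing was read, so the file is exhausted.
      for I in Buffer'First .. Last loop
         Current := Character'Val(Buffer(I));
         C := C + 1;
         if Is_Line_Terminator(Current) then
            Inside := False;
            L := L + 1;
         elsif Is_Space(Current) then
            Inside := False;
         elsif not Inside and not Is_Control(Current) then
            Inside := True;
            W := W + 1;
         end if;
      end loop;
   end loop;
   SIO.Close(File);
   Ada.Text_IO.Put_Line(Natural'Image(L) & Natural'Image(W) & Natural'Image(C));
end Wc_Buffered;
```

The state in Inside is carried across blocks, so a word split between two reads is still counted once.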

There are likely some differences between POSIX wc and this, involving how characters are treated as words, but I wanted to use Characters.Handling without complication. I'd rather argue the former is erroneous, in its treating of punctuation as ``words'', but it's largely irrelevant for the purpose. I didn't make use of Ada's extensive character facilities for this, but that's another quality I may pursue at my leisure later.
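To make one such difference concrete, here is a small sketch of word counting under the POSIX rule, where a word is any non-empty run of characters delimited by white space. It assumes the C locale, in which white space is the space character and the controls from horizontal tab through carriage return. Is_Posix_Space, Words, and Posix_Words are hypothetical names introduced only for this illustration.

```ada
-- Sketch only: assumes an Ada 2012 compiler and the white space set of the C locale.
with Ada.Text_IO, Ada.Characters.Latin_1;

procedure Posix_Words is
   use Ada.Characters.Latin_1;

   -- Hypothetical helper: True for space, HT, LF, VT, FF, and CR.
   function Is_Posix_Space (Item : Character) return Boolean is
     (Item = Space or Item in HT .. CR);

   -- Count the runs of characters which aren't white space in Text.
   function Words (Text : String) return Natural is
      Count  : Natural := 0;
      Inside : Boolean := False;
   begin
      for Current of Text loop
         if Is_Posix_Space(Current) then
            Inside := False;
         elsif not Inside then
            Inside := True;
            Count := Count + 1;
         end if;
      end loop;
      return Count;
   end Words;
begin
   -- The sample holds a horizontal tab between its first two words.
   Ada.Text_IO.Put_Line(Natural'Image(Words("one" & HT & "two - three")));
end Posix_Words;
```

Under this rule the sample has four words, the lone hyphen among them. The program above would count three, since a horizontal tab is neither a line terminator nor a space to Characters.Handling and so doesn't end the word before it.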

I expect to have a Common Lisp version of this article tomorrow, and I may revisit this to add performance measurements and new implementations, but this will remain casual.