Hi guys
I would like to consult with you about my current VLISP application. It's basically a database application with a DCL handler.
First it reads database files, which consist of record lines written (by another module) from drawings: counts of invalid layer names, layouts & blocks, plus some string ID data such as dates, login names and drawing names.
These database files are saved per month, and each day another 80-100 record lines are added, which by the end of the month brings me to 2000-2500 record lines.
I read these files into memory as one big list, and I use the (cons) function to speed things up, but it still takes a lot of time to read everything in and to handle other tasks like removing records and sorting...
These actions are done while the dialog is open, and AutoCAD enters a hold state, so it looks like it is stuck.
The output of this application is a bundle of AutoCAD table (Excel-like) A4 sheets, which also take a lot of time to create.
I would like to get some advice from you guys on how to deal with such an application, because these database files may grow each month.
Thanks in advance
Moshe
Hi,
Without seeing the code it's almost impossible to say anything. Can you post the code, or at least a more detailed description of the process involved? Also, if you really need speed, you could try .NET; it's a lot faster (a 10x to 100x factor) than LISP, especially in DB management and processing.
Gaston Nunez
Do as much sorting, filtering, and searching as you can on the database side. That's what it's built for.
On the list side, avoid unnecessary repetition. Work out the test sequences so that you make an early exit from the loop when nothing else needs to be done.
Gaston,
I'm sorry, but I cannot post the code, and at the moment I cannot even consider rewriting it in another language (I do not know .NET).
Basically, I'm talking about creating a list by reading some fields from the file, building the list with the (cons) function instead of the (append) function.
I would like to 'hear' more opinions from others.
thanks
moshe
It's difficult to make recommendations in the absence of much info.
Why does the application need to read data from earlier in the month?
How does it process that data? IOW trends, rating systems, ...
Are your "database" files just text files from outputting lisp data via write-line or print statements?
Are you reading it in with read-line statements?
Cons is much faster than append at building lists.
Adding a progress bar might assure users that AutoCAD is not locked up.
The database files are built on a regular basis by another reactor module that loads from acaddoc.lsp and runs on each opened drawing. They are two programs that run separately; the second module, with the big list, runs at the command line.
Yes, the database file is a text file created with (write-line) and read back via (read-line).
I do use a progress bar when I'm creating the tables, but I didn't think to use one while the dialog is open.
The problem with progress bars (acet-ui-progress) and reading text files is that you do not know the number of records in advance.
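One way around not knowing the record count in advance is to drive the progress bar by bytes rather than records, since the file size is known up front. A sketch, assuming Express Tools is loaded (for acet-ui-progress) and that (vl-file-size) is available in your AutoCAD version:

```lisp
;; Sketch: drive acet-ui-progress by bytes instead of record count.
(defun read-with-progress (filename / file line result size done)
  (if (and (setq filename (findfile filename))
           (setq size (vl-file-size filename)))  ; total bytes, known up front
    (progn
      (acet-ui-progress "Reading database..." size)
      (setq file (open filename "r")
            done 0)
      (while (setq line (read-line file))
        ;; +1 roughly accounts for the newline stripped by read-line
        (setq done   (+ done (strlen line) 1)
              result (cons line result))
        (acet-ui-progress done)
      )
      (close file)
      (acet-ui-progress) ; remove the progress meter
      (reverse result)
    )
  )
)
```

The byte count is approximate (line endings may be one or two characters), but for a progress meter that does not matter.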
Instead of reading/writing to a single file per month, you might try a file a day. After each file open, you could tick the progress bar.
And opening and closing 20-25 files per month would make it faster?! I'm not sure about that...
There is also a need to work with more than one month together.
thanks
Just reading a text file into a list is not a major problem. On my computer reading a file of 100000 lines, about 60 characters per line, into a list from a network disk takes about 0.3 seconds.
The function i use for reading:
(defun file-to-list (filename / file result line)
  (if (findfile filename)
    (progn
      (setq file (open filename "r"))
      (while (setq line (read-line file))
        (setq result (cons line result))
      )
      (close file)
      (reverse result)
    )
    (alert (strcat "File " filename " not found!"))
  )
)
So, unless you are doing it some totally different way, your problems are likely elsewhere.
As it is pretty impossible to guess what you are doing inefficiently without seeing the code (not necessarily the full production code, some simplified example would be sufficient), I can only offer generalities.
First of all, measure which parts of your program are taking the time. Typically in a program there are just a few bottlenecks, and not necessarily in the obvious places. So scatter some timestamp commands into your program to find the hot spots. (build some reporting around (getvar "CDATE"), for example)
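For example, a minimal timing helper around (getvar "CDATE") might look like the sketch below. CDATE's fractional part encodes HHMMSShh, so resolution is about 1/100 second, and it wraps at midnight:

```lisp
;; Return the time of day in seconds, derived from CDATE (YYYYMMDD.HHMMSShh)
(defun cdate-secs (/ x)
  (setq x (* (rem (getvar "CDATE") 1) 100000000.0)) ; HHMMSShh as a number
  (+ (* 3600.0 (fix (/ x 1000000)))        ; hours
     (* 60.0 (fix (rem (/ x 10000) 100)))  ; minutes
     (/ (rem x 10000.0) 100.0)             ; seconds.hundredths
  )
)

;; usage: bracket a suspect section
;; (setq t0 (cdate-secs))
;; ... code under test ...
;; (princ (strcat "\nElapsed: " (rtos (- (cdate-secs) t0) 2 2) " s"))
```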
Moving to another language doesn't help much, if the algorithm you are using is inefficient.
--
Martti,
you're right... just reading a text file and putting it into a list is not the problem, and my code looks much the same as yours.
But the records in the file contain field data that I have to put into different variables and validate. (You know, we are talking about text files saved in a network shared folder, and any user can open them, even by mistake, and change them.)
The file can have comment lines (marked by a ';' char as the first char), and I also allow empty lines inside, so all of that must be checked before adding a record to the list.
The file may also have duplicate records, so I have to eliminate those.
Each record has a date field, and the program has a date-range setting that the user can set in order to eliminate out-of-range records.
So all of that, plus reading several files, takes some seconds, but to the user it looks like AutoCAD is stuck.
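The per-line filtering described there can be kept cheap if every test short-circuits on the raw line before any field parsing happens. A sketch of that structure (parse-record and record-valid-p are hypothetical placeholders for your own field splitting and date/validity checks):

```lisp
;; T if the raw line is worth parsing: not empty, not a ';' comment
(defun usable-line-p (line)
  (and line
       (/= line "")
       (/= (substr line 1 1) ";")
  )
)

;; Read a file, keeping only lines that parse into valid records
(defun read-records (filename / file line rec result)
  (if (setq filename (findfile filename))
    (progn
      (setq file (open filename "r"))
      (while (setq line (read-line file))
        (if (and (usable-line-p line)       ; cheap tests first
                 (setq rec (parse-record line))   ; hypothetical
                 (record-valid-p rec)             ; hypothetical
            )
          (setq result (cons rec result))
        )
      )
      (close file)
      (reverse result)
    )
  )
)
```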
thank you very much for your advices
Moshe
One idea worth checking would be doing the filtering with external tools. Unix-based text functions, in this case grep and uniq, are usually pretty fast, so you could find Windows versions of those and see whether they help any.
From your task list, the removal of duplicates seems the likeliest to cause performance problems, so that would probably be the first thing to check. Can you run the program with that disabled, to measure the time required?
--
@martti.halminen wrote:One idea worth checking would be doing the filtering with external tools. Unix-based text functions, in this case grep and uniq, are usually pretty fast, so you could find Windows versions of those and see whether they help any.
What are those? How can we call Unix-based text functions from VLISP?
From your task list, the removal of duplicates seems the likeliest to cause performance problems, so that would probably be the first thing to check. Can you run the program with that disabled, to measure the time required?
As a matter of fact, I do not remove duplicate records; I only check whether a record already exists, and if so, I do not add the new one to the list. (I'm now realizing that I should remove the existing one instead, in order to keep the last one.)
here is how I find a duplicate record, followed by how I sort the list:
; return T if record already exists in big list
(defun is_record_exist (new-record / item)
  (vl-some
    '(lambda (item)
       (and (eq (nth 0 new-record) (nth 0 item))
            (eq (nth 1 new-record) (nth 1 item))
            (=  (nth 2 new-record) (nth 2 item))
       )
     )
    big-list^
  )
)

; sort big list by name
(defun sort_by_name (/ r0 r1)
  (vl-sort big-list^
    '(lambda (r0 r1) (< (car r0) (car r1)))
  )
)
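Since you mention wanting to keep the last copy of a duplicate, one sketch is to remove any matching record before consing the new one. Note I've used equal rather than eq for the first two fields: eq tests object identity, and two strings read separately from a file are distinct objects, so an eq test on them can miss duplicates.

```lisp
;; T if two records share the same key fields
;; (equal, not eq, so string contents are compared)
(defun same-key-p (a b)
  (and (equal (nth 0 a) (nth 0 b))
       (equal (nth 1 a) (nth 1 b))
       (=     (nth 2 a) (nth 2 b))
  )
)

;; Add new-record, replacing any older record with the same key,
;; so the most recent copy wins
(defun add-or-replace (new-record)
  (setq big-list^
    (cons new-record
          (vl-remove-if
            (function (lambda (item) (same-key-p new-record item)))
            big-list^
          )
    )
  )
)
```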
Moshe
grep and uniq are just ordinary programs that were originally created on Unix systems, but there are also Windows implementations of those, so you would need to find one of those and install it. Calling those is done with startapp (or dos_execute if you have DOSlib), but it would probably be easier to build a script around them instead of trying to get the parameters correct from Autolisp.
Probably not worth the bother if you are not familiar with the system.
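For what it's worth, a call via startapp could look like the line below. This is a sketch only: the paths are made up, and you should verify that your Windows version's sort.exe supports the /O switch. Also note that startapp returns immediately, so the LISP side would have to wait for the output file to appear before reading it.

```lisp
;; Hand the sorting to Windows' sort.exe (hypothetical paths)
(startapp "cmd.exe"
          "/c sort C:\\data\\month.txt /O C:\\data\\month-sorted.txt")
```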
Your functions look reasonable; a slight performance improvement might happen in compiled code by using (function (lambda ...)) instead of '(lambda ...).
If your records are in random order, vl-some is about best you can do to find duplicates. If the duplicates only occur close to each other, something that wouldn't go through the whole list every time might be faster.
Assuming you only sort the list once, the performance shouldn't be a problem. Have you checked what vl-sort does to duplicates in your list? (in some situations it drops duplicates, depending on the comparison function).
--
Assuming you only sort the list once, the performance shouldn't be a problem. Have you checked what vl-sort does to duplicates in your list?
yes i only sort the list once
did not check what vl-sort does to duplicate items
(in some situations it drops duplicates, depending on the comparison function).
can you provide more details about that
moshe
@Moshe-A wrote: did not check what vl-sort does to duplicate items... can you provide more details about that?
The documentation of VL-SORT only says: "Duplicate elements may be eliminated from the list."
On further testing, it seems that it drops those elements that are EQ to some other element, regardless of the comparison function:
_$ (setq data '(1 1 1 2 2 1 1 2 3 4 1 2 34 3 4))
(1 1 1 2 2 1 1 2 3 4 1 2 34 3 4)
_$ (setq data2 '("1" "1" "1" "2" "2" "1" "1" "2" "3" "4" "1" "2" "34" "3" "4"))
("1" "1" "1" "2" "2" "1" "1" "2" "3" "4" "1" "2" "34" "3" "4")
_$ (vl-sort data '<)
(1 2 3 4 34)
_$ (vl-sort data2 '<)
("1" "1" "1" "1" "1" "1" "2" "2" "2" "2" "3" "3" "34" "4" "4")
If we trick it with a faulty comparison function that doesn't produce a consistent ordering, we get even odder results: some duplicates are dropped, but not all:
_$ (vl-sort data '(lambda (a b) nil))
(1 2 1 3 4 1 2 34 3 4)
_$ (vl-sort data '(lambda (a b) t))
(4 3 34 2 1 4 3 2)
So for your purposes this is not a problem: your records are separately built lists, so two records with equal contents are never EQ, and vl-sort will not drop them.
--