Dissociated Press

There is this nice Emacs game where you take some random text and use random walk to generate a somewhat similar but completely random text out of it. It is called “Dissociated Press“.

Out of sheer randomness, I’ve implemented this in Oracle.

Notes:

  • I’m using varchar to hold the text, which limit us to short and therefore uninteresting texts. CLOB would be better.
  • If there is any serious amount of text involved, you’ll want to create a non-unique index on the first two columns of the markov table after you populate the table, and gather statistics on that.
  • Feel free to criticize my code. That is why I post it. But note that I’ve broken some lines so it will look nice in the blog.
  • Also feel free to rewrite the entire thing with a single model statement.
  • The algorithm should be pretty straightforward, but you can read about a similar implementation here.
CREATE TABLE markov
       (c1 VARCHAR(30),
       c2 VARCHAR(30),
       c3 VARCHAR(30));

DECLARE
        cnt BINARY_INTEGER;
        i BINARY_INTEGER;
        my_str VARCHAR(4000) := 'A bug that has been documented.
To call something a feature sometimes means the author of the
program did not consider the particular case, and that the program
 responded in a way that was unexpected but not strictly incorrect.
 A standard joke is that a bug can be turned into a feature simply
 by documenting it (then theoretically no one can complain about
it because its in the manual), or even by simply declaring it to
be good. Thats not a bug, thats a feature is a common
catchphrase.';
        my_comma_str VARCHAR(4000);
        my_table dbms_utility.uncl_array;
        num_itr INTEGER := 50;
        tmp_c1 VARCHAR(30);
        tmp_c2 VARCHAR(30);
        tmp_c3 VARCHAR(30);
BEGIN
     select REPLACE('"' ||
	REPLACE(my_str,' ','","') || '"',',"",',',')
	INTO my_comma_str FROM dual;
     dbms_utility.comma_to_table(my_comma_str, cnt, my_table);
     FOR i in 3..cnt LOOP
         INSERT INTO markov (c1,c2,c3) VALUES
	(replace(my_table(i-2),'"',''),
	replace(my_table(i-1),'"',''),replace(my_table(i),'"',''));
     END LOOP;
     COMMIT;
     SELECT c1,c2,c3 INTO tmp_c1,tmp_c2,tmp_c3 FROM markov
     WHERE ROWNUM=1;
     DBMS_OUTPUT.put_line(tmp_c1 || ' ' || tmp_c2 || ' ');
     FOR i IN 1..num_itr loop

     	 SELECT c1,c2,c3  INTO tmp_c1,tmp_c2,tmp_c3 from
		  (SELECT * FROM markov WHERE c1=tmp_c2 AND
		  c2=tmp_c3 ORDER BY dbms_random.VALUE) t
		  WHERE ROWNUM=1;
		  DBMS_OUTPUT.put_line(tmp_c2 || ' ');
	 END LOOP;

END;

**
This post has been partially inspired by this one.



Leave a comment