Venexus DotNetNuke Blog - Thursday, May 31, 2007
DotNetNuke Articles, Code Snippets, Errors, and News
 
 Saturday, May 26, 2007

We had a DNN 4.4.1 to 4.5.1 upgrade that threw a few SQLDataProvider errors during the installer. They were caused by duplicate rows in the DNN Files table. Not sure how they got there, but here is a SQL statement to check whether any exist:

/* one row for each PortalID/Folder/FileName combination that appears more than once */
SELECT PortalID, Folder, FileName, COUNT(*) AS NumOccurrences
FROM Files
GROUP BY PortalID, Folder, FileName
HAVING COUNT(*) > 1

Here is the SQL that was provided in the log file from the installer:

/* add unique constraint to Files table */
IF NOT EXISTS (select * from dbo.sysobjects where id = object_id(N'dbo.[IX_FileName]') and OBJECTPROPERTY(id, N'IsConstraint') = 1)
BEGIN
  declare @FolderID int
  declare @FileName nvarchar(100)
  declare @FileID int
  declare @MinFileID int

  select @FolderID = min(FolderID)
  from Folders
  while @FolderID is not null
  begin 
    /* check for duplicate Filenames */
    select @FileName = null
    select @FileName = FileName
    from Files
    where FolderID = @FolderID
    group by FileName
    having COUNT(*) > 1
 
    /* if duplicates exist */
    if @FileName is not null
    begin
      /* iterate through the duplicates */
      select @FileID = min(FileID)
      from Files
      where FolderID = @FolderID
      and FileName = @FileName

      /* save min FileID */
      select @MinFileID = @FileID

      while @FileID is not null
      begin
        if @FileID <> @MinFileID
        begin
          /* remove duplicate file */
          delete
          from Files
          where FileID = @FileID
        end

        select @FileID = min(FileID)
        from Files
        where FolderID = @FolderID
        and FileName = @FileName
        and FileID > @FileID
      end
    end

    select @FolderID = min(FolderID)
    from Folders
    where FolderID > @FolderID
  end
  
  ALTER TABLE dbo.Files ADD CONSTRAINT
    IX_FileName UNIQUE NONCLUSTERED
    (
      FolderID,
      FileName
    ) ON [PRIMARY]
END
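
The installer script above removes duplicates row by row, keeping the lowest FileID for each FolderID/FileName pair before adding the unique constraint. As quoted, the loop appears to pick only one duplicated FileName per folder, so a folder holding several different duplicated names could still trip the constraint. Here is a minimal set-based sketch of the same "keep the lowest FileID" idea. It is written in Python against an in-memory SQLite table purely for illustration: the table only mimics the three relevant columns, and the function name and sample rows are made up. Back up your database before trying anything similar against a real DNN install.

```python
import sqlite3


def remove_duplicate_files(conn):
    """Delete every Files row whose (FolderID, FileName) pair already
    exists with a lower FileID. Returns the number of rows removed."""
    cur = conn.execute(
        """
        DELETE FROM Files
        WHERE FileID NOT IN (
            SELECT MIN(FileID) FROM Files GROUP BY FolderID, FileName
        )
        """
    )
    conn.commit()
    return cur.rowcount


# Hypothetical sample data: a stand-in for the real DNN Files table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Files (FileID INTEGER PRIMARY KEY, FolderID INTEGER, FileName TEXT)"
)
conn.executemany(
    "INSERT INTO Files (FileID, FolderID, FileName) VALUES (?, ?, ?)",
    [
        (1, 10, "logo.gif"),
        (2, 10, "logo.gif"),    # duplicate of FileID 1
        (3, 10, "banner.jpg"),
        (4, 10, "banner.jpg"),  # second duplicated name in the same folder
        (5, 11, "logo.gif"),    # same name, different folder: not a duplicate
        (6, 10, "logo.gif"),    # another duplicate of FileID 1
    ],
)

removed = remove_duplicate_files(conn)
remaining = conn.execute("SELECT FileID FROM Files ORDER BY FileID").fetchall()

# With the duplicates gone, a unique index on (FolderID, FileName) can be created.
conn.execute("CREATE UNIQUE INDEX IX_FileName ON Files (FolderID, FileName)")
```

Because the delete is keyed on the whole FolderID/FileName group, it handles any number of duplicated names per folder in a single pass.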

 

Saturday, May 26, 2007 12:48:57 PM (US Eastern Standard Time, UTC-05:00)
 Wednesday, May 16, 2007

We have immediate openings for DotNetNuke Module Developers.

Title: DotNetNuke Module Developer
 
Skills Required:
  1. VB.Net or C#
  2. SQL Server
  3. Visual Studio 2005
  4. Code Generation Techniques
  5. DotNetNuke
 
Skills Desired:
  1. EntitySpaces
  2. Gemini
  3. Subversion
  4. CruiseControl.Net
  5. VB.Net AND C# (Ability to read and code in both)
 
Location:
We would prefer to find a local candidate (Raleigh, NC), but if you have the skills, it does not matter where you live.
 
Description:
This position will be responsible for assisting in the analysis, design, development and ongoing support of DotNetNuke and the modules we create and modify for our clients. This position will assist with verification testing, troubleshooting and failure analysis of new versions of DotNetNuke, core modules, 3rd party modules, and custom modules. The developer must be able to commit to and meet deadlines.
 
The person filling this position will be working in a team environment and may be expected to have on-call responsibilities. The candidate should have excellent verbal and written communications skills with a positive customer support attitude. A person who is flexible and self-motivated will be the selected candidate.
 
Additional Requirements:
  1. You must love code.
         If you didn’t love it, you wouldn’t be doing it, right?
  2. The weak shall not apply.
         Please do not waste our time if you do not know how to code or have never used DotNetNuke.
 

Requests for more information and resumes can be sent to careers (at) venexus (dot) com. No recruiters please!

Wednesday, May 16, 2007 3:25:00 PM (US Eastern Standard Time, UTC-05:00)
 Sunday, April 29, 2007

If you didn't catch it, DotNetNuke announced the first officially sponsored DotNetNuke conference in Las Vegas November 5-8, 2007. Last year was the first time I had been to Vegas, when we went to a modular software development conference last September. It was fun meeting some of the guys in the DNN community and I look forward to meeting them and others again this year. According to Joe Brinkman's blog, DotNetNuke is actively seeking speakers to submit session proposals and selected speakers will receive "3 nights of lodging at the Mandalay Bay Casino and Hotel". While I probably will not be giving a speech, Michael, Scott and I have already marked our calendars, and we may have to twist the arm of a couple more staff to go with us, depending on development schedules. Sounds like fun, see ya there!

Sunday, April 29, 2007 2:40:03 PM (US Eastern Standard Time, UTC-05:00)
 Friday, April 20, 2007

Continuous integration is a software development term for the practice of completely rebuilding and testing an application frequently, typically every time code is committed. We recently implemented a continuous integration environment to “publish” our DotNetNuke modules to our development/staging DotNetNuke sites.

The main advantages of a continuous integration environment are:

  • Issues are detected and fixed continuously!
  • Enhancements and new features are published continuously!
  • You are warned about problematic code before it is published.
  • Immediate unit testing of all changes.
  • Constant availability of a "current" build for testing, demos, or releases.
  • Bragging rights for developers who have the least number of broken builds.
  • Huge conservation of time when considering the normal administrative process of Build > Package > install in DNN > Test.
  • Did I mention this is continuous?

How Our CI and DNN Environment Works

Below is a picture to show you the basics of how our continuous integration environment works.

Disclaimer:

  1. If you are running Visual Studio 2005 Web Developer Express, our setup will not work for you. You can stop reading here or upgrade, unless you are just curious, in which case read on…
  2. There have been long discussions about Web Site Projects (WSP) versus Web Application Projects (WAP). This post is not here to argue about which is better, only to say what we do and describe the basics of how it works.
  3. You are free to comment and collaborate with others on this post. Feel free to even argue about WAP versus WSP, or to share a way to integrate CI with DNN and WSP; we really do not care. However, we do not have time to walk you through setting any of this up, so please do not ask…unless you are interested in one of our DNN support packages, then by all means we can help ;-)
     

Basics of the process:

  1. A developer “commits” the DNN module code to Subversion.
  2. The commit triggers CruiseControl.Net to “build” the DNN module using a “Trigger” for the project.
  3. CruiseControl.Net can be configured to unit test the module before publishing.
  4. An “ExecutableTask” is used with our custom assembly publisher application to send the .DLL file to the DNN website.
  5. A “BuildPublisher” sends the code from the source directory to the DNN site (D:\DNNSites\ClientDevSite\DesktopModules\CustomDNNModule as example).
  6. Results of “build” are visible in the CruiseControl.Net Web Dashboard.
     

Implementing CI for DNN Development

Development:

We started DotNetNuke 4 development using the WSP methodology for all of our DNN projects. This has been successful for us for quite a while, especially when using EntitySpaces for the Persistence Layer and Business Objects (it's so easy to generate the DAL with ES; you must check it out if you are not using it). However, we found that to make continuous integration work we have to use WAP projects. I have a very fast laptop, but with WSP, compiling a module means building all of DNN, which can be quite time consuming, sometimes taking several minutes. Building a WAP project, on the other hand, is FAST, which saves development time when debugging and testing builds, especially when doing those final little tweaks. One could argue that WSP is better, but for our setup, with one to many developers working on a single module, WAP is the best decision. So it did not take much to twist our arm into changing our methodology. Unfortunately, in order to use this for our existing clients and projects, we will need to convert our WSP modules to WAP. But we have started developing all new modules as WAPs, and converting WSP modules to WAP is not a difficult task.

To read about the great WSP versus WAP debate, see the following links:

Shaun Walker’s post on WAP

http://www.dotnetnuke.com/Community/Blogs/tabid/825/EntryID/434/Default.aspx

An interesting debate between Michael Washington and Vladan Strigo

http://www.dotnetnuke.com/Projects/ModuleNews/Forums/tabid/953/forumid/111/threadid/91268/threadpage/6/scope/posts/Default.aspx

WAP Methodology

“Web Application Projects provide a companion web project model that can be used as an alternative to the built-in Web Site Project in Visual Studio 2005. This new model is ideal for web site developers who are converting a Visual Studio .Net 2003 web project to Visual Studio 2005. (Released May 8, 2006)” - http://msdn2.microsoft.com/en-us/asp.net/aa336618.aspx

Michael Washington has a great post on creating a DNN WAP Module:

http://www.adefwebserver.com/DotNetNukeHELP/DNN4_WAP/

Once the module is ready for testing on our client development site, we use TortoiseSVN to “commit” code to Subversion (http://tortoisesvn.tigris.org/).

Source Code Version Control:

We use Subversion for our source code repository and version control system (http://subversion.tigris.org/). We have tried Visual SourceSafe and CVS, but have been using Subversion with success for quite some time now. CruiseControl.Net integrates with Subversion easily. It is also nice that there is a plugin for Subversion that allows us to send our comments directly to Gemini (project management/tracking application) when we commit new code.

Continuous Integration Software:

We use CruiseControl.Net, a .Net port of the Java-based CruiseControl.

Continuous Integration Server using CruiseControl.Net

The CruiseControl.Net Server automates the integration process by monitoring the team's source control repository directly. Every time a developer commits a new set of modifications for the DNN module, the server will automatically launch an integration build to validate the changes. When the build is complete, the server notifies the developer whether the changes that they committed integrated successfully or not.

CruiseControl.Net allows for several different types of “Tasks”, such as:

  • EmailPublisher (for emailing of build details)
  • ExecutableTask (for kicking off executables, such as our custom assembly publisher)
  • NAntTask (for running NAnt build scripts)
  • NUnitTask (for unit testing)
  • RSSBuildsPublisher (for generating an RSS feed with details)
  • VisualStudioTask (for building a Visual Studio solution)
  • Etc.

Here is an example ccnet.config:
<!--<ccnetconfig><configurationVersion>1.2.1</configurationVersion></ccnetconfig>-->
<cruisecontrol>
  <project name="BPLWantList">
    <workingDirectory>D:\cibuilds\ProjectName\ModuleName</workingDirectory>
    <webURL>http://ourCIdomain.com/server/local/project/ModuleName/ViewProjectReport.aspx</webURL>
    <sourcecontrol type="svn">
      <trunkUrl>svn://localhost/ProjectName/ModuleName</trunkUrl>
    </sourcecontrol>
    <triggers>
      <intervalTrigger name="Quarter Hour Build" seconds="900" />
    </triggers>
    <tasks>
      <exec>
        <executable>VenexusAssemblyPublisher.exe</executable>
        <baseDirectory>D:\</baseDirectory>
        <buildArgs>"d:\Source\ProjectName\DesktopModules\ModuleName\obj\Debug" "d:\DevSites\ProjectName\bin"</buildArgs>
      </exec>
    </tasks>
    <publishers>
      <buildpublisher>
        <sourceDir>D:\cibuilds\ProjectName\ModuleName</sourceDir>
        <publishDir>D:\DevSites\ProjectName\DesktopModules\ModuleName</publishDir>
        <useLabelSubDirectory>false</useLabelSubDirectory>
      </buildpublisher>
      <xmllogger />
      <statistics />
    </publishers>
  </project>
</cruisecontrol>


CruiseControl.Net Web Dashboard

The CruiseControl.Net Web Dashboard Application is used for reporting a wide range of information about the builds. At one end of the scale it reports summary details of all projects in your organisation and at the other it can give specific metric output for any specific build.

Here is an example of a simple DNN Module being used in our CI environment:

Notice the failed build notification. While we have not set up unit testing, you can see in the left menu in the image that there are quite a few different options.

Conclusion

For long term DotNetNuke module projects, setting up a continuous integration environment will save a tremendous amount of time in the long run. All of the tools to implement CI are free, lowering the total cost of DNN module development. With a little bit of time setting up your environment, you can provide continuous updates to your clients, all while forcing good coding practices among your developers. 
 

Friday, April 20, 2007 1:47:50 AM (US Eastern Standard Time, UTC-05:00)
 Sunday, April 08, 2007

I have been asked to compare the differences between our search engine and Open-SearchEngine. I agree this is an important question that needs to be answered, so I decided to put together a comparison between the core DNN Search, Open-SearchEngine, and Venexus Search Engine. While my opinion of which is best is definitely biased toward our own product, I have tried to provide an in-depth look at the basics of how each search engine works, a feature matrix, and a simple search results analysis. Without further ado, read on...

DotNetNuke Search (core project)
DNN Search is part of the DNN core that is installed and configured out of the box.
 
DotNetNuke Search consists of 4 main pieces:
  • Scheduled Task

The scheduled task initiates the process of indexing the modules, at the scheduled time interval. An iteration of all modules that support iSearchable is performed. During this process, text that is extracted from the module is cleaned, parsed, and added to search word and search items tables.

  • Search Admin

The search admin is for setting the maximum word length, minimum word length, the option to include common words, and the option to include numbers.

  • Search Input Module

A module or skin object can be used to provide the form for the search query. In module settings, you can use the default button, or an image. You do not have the option to change this image within the module, nor change the text. Styles can be used to make some look and feel changes, but it is limited. When a search is performed, the user is redirected to the Search Results page.

  • Search Results Module

This module provides the search results. In the settings, you can set the maximum search results, results per page, maximum title length, maximum description length, and the option to show description. Results are limited to the exact word queried.
 
Oddly enough, there no longer appears to be a DNN forum for search, or a blog dedicated to it on the DotNetNuke website. However, a good place to find out more about the core module is ecktwo’s site. There is a lot of information about how all the pieces work together, as well as the bugs/issues of DotNetNuke Search. There is also a tutorial and report on DNN Search for DNN 4.
 
Open-SearchEngine
Open-SearchEngine is developed by Xepient Solutions. The package is capable of indexing HTML content as well as PDF’s and several Office documents. Open-SearchEngine uses Lucene.Net, a port of the Java Lucene Search Engine, for indexing and querying.
 
Open-SearchEngine consists of 4 main pieces:
  • Scheduled Task

The scheduled task initiates the process of spidering, at the scheduled time interval. Lucene.Net handles indexing of the data.

  • Search Engine Admin Module

This module provides an interface for configuring the search engine to your preferences. You can add a starting URL and by default, spidering is enabled. This allows you to offer multiple sites in your search engine. However, unless disabled, each time you run the process to update the index, all URLs are re-crawled. With many URLs on the site(s) you index, this can lead to very long crawling and indexing runs.

  • Search Input Module 

A module or skin object can be used to provide the form for the search query. In module settings, you can use the default button, or an image. You also have the option to add “Search” as text or image before the textbox.

  • Search Results Module

This module provides the search results. In the settings, you can set which sites are part of the results scope, maximum results per page, maximum title length, title link target, and the option to hide description.
 
 
 
Venexus Search Engine
The Venexus Search Engine is quite different from the other 2 solutions. The package includes 2 modules and requires MS SQL Server Full-Text Indexing. Like traditional crawlers, VSE can crawl and index a variety of data, but the real difference is its ability to also “crawl” and index RSS feeds. This is the key to keeping the search results up-to-date, while conserving server and bandwidth resources. Rather than recrawling and reindexing all content, "smart caching" is used to determine when RSS feeds need to be aggregated, and when non-syndicated content needs to be recrawled on the site.
  
The Venexus Search Engine consists of 2 main pieces:
  • Seamus Module

The Seamus module is the “search engine aggregation module utilizing syndication”. On the first load of the module, Seamus iterates through the core DNN modules on all portals that support the iPortable interface. Seamus uses this “initial dump” to gather other URLs for the site. You also have the ability to add feeds to Seamus, not only for your site, but for any external site. With “global crawler” enabled, any external site URLs that are discovered during crawling are added to the queue as well. Using AJAX, Seamus crawls 3 feeds and 3 URLs with each load. If the user remains on the page, Seamus will continue to crawl via AJAX and save the data to the table for indexing. This decreases the load on the server by spreading the crawling and indexing across several user sessions, rather than a single scheduled task.

  • Search Module

The Search module provides the search box, as well as the results. Using Microsoft SQL Server’s Full-Text Indexing feature, data is indexed from the crawling and storing provided by Seamus. Within the settings you can specify the search button text or use your own custom image for the button, set maximum search length, set search box size, maximum results, results per page, set maximum length of display URL, specify a remote connection string (database other than DNN), specify portal specific search, or allow the user to select between site search and all-of-the-web search.
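
To make the "3 feeds and 3 URLs with each load" idea concrete, here is a minimal sketch of per-load batching in Python. It is an illustration only, not the Seamus source: the queue structure, function names, and example URLs are all hypothetical, and the actual fetching and storing of content is left out.

```python
from collections import deque

# Batch sizes taken from the description above; everything else is assumed.
FEEDS_PER_LOAD = 3
URLS_PER_LOAD = 3


def take_batch(queue, limit):
    """Remove and return up to `limit` items from the front of the queue."""
    batch = []
    while queue and len(batch) < limit:
        batch.append(queue.popleft())
    return batch


def crawl_on_page_load(feed_queue, url_queue):
    """Select the small slice of work that one page load (or one AJAX
    callback) is responsible for. A real crawler would fetch and store
    each item; here we only return what would be crawled."""
    feeds = take_batch(feed_queue, FEEDS_PER_LOAD)
    urls = take_batch(url_queue, URLS_PER_LOAD)
    return feeds, urls


# Hypothetical pending work: 5 feeds and 4 pages.
feed_queue = deque(f"http://example.com/feed{i}.xml" for i in range(5))
url_queue = deque(f"http://example.com/page{i}.aspx" for i in range(4))

first_feeds, first_urls = crawl_on_page_load(feed_queue, url_queue)
second_feeds, second_urls = crawl_on_page_load(feed_queue, url_queue)
```

Each visitor request only ever pays for a handful of fetches, which is how the work gets spread across user sessions instead of landing in one long scheduled run.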

Feature Comparison Matrix:

Below you will find a list of features for DNN Search, Open-SearchEngine, Venexus Search Engine Standard, and Venexus Search Engine PRO.

| Feature | DNN Search | Open-SearchEngine | Venexus Search Engine Standard | Venexus Search Engine PRO |
|---|---|---|---|---|
| Crawling Method | Module Indexer (Must implement iSearchable) | Custom URL crawler/spider (Must have starting URL for each site, with crawling enabled) | Custom Crawler (Uses iPortable interface, traditional URL crawler/spider, and RSS aggregation) | Custom Crawler (Uses iPortable interface, traditional URL crawler/spider, and RSS aggregation) |
| Crawl and Index Start | Requires DNN Scheduled Task | Requires DNN Scheduled Task | User Interactive (AJAX in aggregation module) | User Interactive (AJAX in aggregation module) |
| Global Crawler | No | No (Requires input of each domain) | No | Yes |
| DNN User Impersonation | No | Yes | No | No (Version 2.0) |
| Windows Authentication | No | Yes | No | No (Version 2.0) |
| Exclude List | No | Yes | Yes | Yes |
| Excel Documents | No | Yes | No | Yes |
| PDF Files | No | Yes | No | Yes |
| PowerPoints | No | Yes | No | Yes |
| RTF Files | No | No | No | Yes |
| Word Docs | No | Yes | No | Yes |
| Index File System | No | Yes | No | No (Version 2.0) |
| Index | Table Driven Index | Lucene.Net (flat file) | Full-Text Indexing in SQL Server (flat file) | Full-Text Indexing in SQL Server (flat file) |
| RSS | No | No | No | Yes |
| Enclosure Support (podcast/vodcast) | No | No | No | Yes |
| Feed Discovery | No | No | Yes | Yes |
| Smart Caching | No | No | Yes | Yes |
| Allow users to add feeds | No | No | No | Yes |
| Generates RSS Feed of latest items indexed | No | No | Yes | Yes |
| Blog and Feed Aggregator Pinging | No | No | No | Yes |
| Search Skin Object | Yes | Yes | Yes | Yes |
| Utilize DNN Search Skin | Yes | No | Yes | Yes |
| Modify search box and image | No | Yes | Yes | Yes |
| Use Image or Text for Search button | No | Yes | Yes | Yes |
| Portal(site) or Web search | No | No | Yes | Yes |
| Keyword Highlighting | No | Yes | Yes | Yes |
| Cached Version | No | No | No | No (Version 2.0) |
| User Saved Searches | No | No | No | No (Version 2.0) |
| Social Bookmarking | No | No | No | Yes |
| Price | Free | $49 | Free | $199 |

Performance and Relevancy:

What about performance and the relevancy of the results? I set up a test site with 5 total pages of content and installed/configured DNN Search, Open-SearchEngine, and Venexus Search Engine on separate pages. I also installed the PageGenerated module from Ventrian Systems to show page execution time. I cannot vouch for the accuracy of this as a benchmark, but the following results are the best of 5 consecutive query executions against each search engine using "truman" (without quotes) as the search query. In reality, there are only 2 relevant pages associated with "truman": the home page of the site, which has a contextual link with the text "Truman Doctrine", and the full document about the "Truman Doctrine" that the link points to. Ideally, we should expect the document that is all about "truman" and his doctrine to be listed first:

DNN Search:

Best Execution Time: 0.218531 seconds

Results Returned: 1

Notes:

The only result returned is not the most relevant page on the site. In fact, the "Truman Doctrine" page is not even listed as a result. This must be because the word "truman" does not actually appear in the content of the text/html module on the Truman Doctrine page. There is "HARRY S. TRUMAN'S ADDRESS" in the content, but DNN Search can only return results where the query is spelled EXACTLY like something in the content.

Open-SearchEngine:
 

Best Execution Time: 0.1093155 seconds

Results Returned: 10

Notes:

Notice the poor description and the fact that the true most relevant document (the "Truman Doctrine" page) is the 5th result. Also, there are several results for pages that have no information about "Truman" except for the link in the SolPartMenu. While it is good that the search engine is able to crawl the SolPartMenu, it is unfortunate that it weights pages that just have links in a menu higher than the most relevant result. The best page execution time was half that of DNN Search, which is excellent.

Venexus Search Engine:

Best Execution Time: 0.046866 seconds

Results Returned: 3

Notes:

Notice the first result is the actual document (the "Truman Doctrine" page) we are looking for. Also, page execution time is less than half that of Open-SearchEngine and less than a quarter of that of DNN Search.
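
For anyone who wants to check the "half" and "quarter" claims, the ratios can be worked out directly from the three best execution times reported above. This is just arithmetic on those numbers, sketched in Python; the variable names are arbitrary.

```python
# Best page execution times (in seconds) as reported above.
best_times = {
    "DNN Search": 0.218531,
    "Open-SearchEngine": 0.1093155,
    "Venexus Search Engine": 0.046866,
}

fastest = min(best_times.values())

# How many times slower each engine is than the fastest one.
slowdown = {name: seconds / fastest for name, seconds in best_times.items()}

for name, factor in slowdown.items():
    print(f"{name}: {best_times[name]:.6f} s ({factor:.1f}x the fastest time)")
```

On these numbers DNN Search takes roughly 4.7 times as long as the Venexus Search Engine, and Open-SearchEngine roughly 2.3 times as long. Keep in mind this is a best-of-5 on a 5-page test site, so treat the ratios as a rough indication rather than a rigorous benchmark.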

Conclusion:

The built-in DotNetNuke Search provided by the DNN core team suits the needs of many smaller sites. However, larger sites will quickly run into issues with memory consumption due to the way the module indexing is performed. The search architecture is limited, and both site performance and search results suffer because of the indexing process and the direct SQL queries against the tables that hold the words and index. Most likely this is due to the requirement for database independence, rather than poor design. If your site is small, needs database independence, and search results are helpful but not really an important piece of your site, then this may be the best tool for you.

If you are looking for a traditional search engine crawler with good scalability, and you require database server independence and decent search results, Open-SearchEngine may be the solution for you. It is by far better than the core DNN Search, but it relies on traditional crawling and indexing methods. Conservation of bandwidth and server resources is debatable since there is no method of smart caching available. The ability of this engine to index directories of files is an important feature that neither DNN Search nor VSE offers. However, the lack of RSS aggregation as the new medium for crawling and gathering new and updated data is a huge issue that will lead to stagnant search results unless all URLs are frequently reindexed. As the simple search results analysis shows, most results are not really relevant, but that is better than returning no truly relevant results at all, as DNN Search did because of spelling differences. It just means your users will have plenty to click on before finding the document they are looking for. While execution time is certainly better than DNN Search, it is still significantly slower than the Venexus Search Engine.

The Venexus Search Engine comes in 2 versions, the Standard (free) version and the Pro (not free) version. The Standard version still offers many of the features smaller sites require, including quick and relevant results, but does not include some of the nicer features of the Pro version, like PDF and MS Office document indexing and the blog and feed aggregator pinging service. Where VSE really shines is in its ability to provide and aggregate RSS feeds for inclusion in its index. The smart caching and user interactive crawling using AJAX distribute the load on the server and bandwidth. The major advantage and disadvantage of VSE is MS SQL Server Full-Text Indexing. The disadvantage is that VSE is NOT database independent and requires a version of MS SQL Server with Full-Text Indexing enabled in order to operate. The advantage is that it uses Full-Text Indexing from MS SQL Server for more relevant and faster search results. We know VSE is scalable because it has been tested against a database of over 2 million indexed pages. The simple search results analysis shows that it is more than 4 times faster than DNN Search and more than 2 times faster than Open-SearchEngine. The actual search results speak for themselves, delivering the most relevant result as #1 and contextual links from the home page as supplemental results.

Picking the right search engine application is important for your website and now you should be armed with the knowledge of how each one operates, the differences in features between them, and the overall performance and relevancy of the search results.

I hope this answers everyone's questions concerning the differences between the 3 DotNetNuke search engines. Feel free to comment with questions or suggestions on how this post can be improved. If you know of a feature or difference that I missed, please let me know. While this post is quite lengthy, I plan on keeping it updated as a resource for those who would like to keep track of the differences between each DNN search engine.

Sunday, April 08, 2007 6:37:38 PM (US Eastern Standard Time, UTC-05:00)
 Saturday, April 07, 2007

DNN 4.5 was released today after a small delay last month. 

"A highly focused three month release cycle results in DotNetNuke® 4.5, a new release with integrated Microsoft ASP.NET AJAX support, a web-based installer, and a variety of other high value enhancements designed to improve the user experience." - DotNetNuke Enriches User Experience

I decided to try out a new install before we perform any upgrades. So, after creating a new database, creating a new directory on the webserver for the site and changing permissions, setting up a new site in IIS, and changing the connection string in web.config, I tried loading the site. I had heard about the new installation wizard and was presented with the following:

I decided to do a custom installation so I could see all of the options.

One of the most common issues people have trying to install DotNetNuke is setting the file/folder permissions. It is nice to see the wizard test these.

Another big issue people have installing DotNetNuke is making sure the connection string is valid. The wizard also supports a database connection test.

Easy install so far....

Nice! You can now configure the host account instead of relying on the defaults.

 

You can also filter which modules get installed during installation.

And set the admin user account and portal properties instead of using the defaults.

You can also have optional skins and containers to be installed, if available.

And language packs if needed.

And done!

I am very impressed with the new installation wizard. It is good to see checks being performed that will undoubtedly decrease support issues for people who have not set up everything correctly.

After logging into the new site, the first thing I noticed was the different icons. I like the new and clean look of these icons. I also LOVE the new "Show Control Panel" dropdown option.

Something new too is the "Solution Explorer":

 

This is a convenient interface to the DNN Marketplace. The "DotNetNuke" and "About" tabs still appear to be in beta mode however. It should be interesting to see how this piece develops.

It is very exciting to see this new release, and I will post later how the upgrades go.

Saturday, April 07, 2007 10:49:15 PM (US Eastern Standard Time, UTC-05:00)
 Tuesday, April 03, 2007

We did a search engine optimization (SEO) campaign for a client a while back that required us to set up a few new domains with blogging software. We used a variety of blogging software and noticed something very interesting: sites that performed pinging were crawled within 1-3 days and indexed within 2 weeks. All of the domains we used were brand new. In some cases, keywords that we were focusing on were in the top 10 search results within 2 weeks as well. Needless to say, the client was happy.

So, we added a pinging feature as part of our DotNetNuke search engine module. Seamus, our search engine aggregation module, generates a RSS feed for the entire DNN portal. Anytime a tab/page is added or updated and Seamus finds this change, if the "Pinging Service" is enabled, it will ping 14 different blog and feed aggregation services.

So you are probably asking how does this work. Well, we send a "ping" to these aggregation services using XML-RPC. It simply tells these services that there is new content available on your site and to check the RSS feed. These services then will consume the feed and add it to their index. This provides a contextual link (assuming you actually use good titles for your pages) from these service sites directly to the content on your site. Pretty cool, huh?
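
For the curious, here is roughly what such a ping looks like on the wire. This is a minimal sketch in Python rather than our module's code. It assumes the widely used `weblogUpdates.ping` XML-RPC convention (a site name plus a site URL), which the post above does not spell out, and the site name, URL, and endpoint are placeholders you would replace with your own. Check each service's documentation for its actual endpoint and any extra parameters.

```python
import xmlrpc.client


def build_ping_request(site_name, site_url):
    """Return the XML-RPC request body for a weblogUpdates.ping call.
    (Assumed convention: two string parameters, name then URL.)"""
    return xmlrpc.client.dumps((site_name, site_url), methodname="weblogUpdates.ping")


def send_ping(endpoint, site_name, site_url):
    """Send the ping to one aggregation service and return its reply.
    `endpoint` is whatever XML-RPC URL the service publishes."""
    proxy = xmlrpc.client.ServerProxy(endpoint)
    return proxy.weblogUpdates.ping(site_name, site_url)


# Build (but do not send) a request for a placeholder site.
payload = build_ping_request("Example DNN Portal", "http://www.example.com/")
params, method = xmlrpc.client.loads(payload)
```

Printing `payload` shows a small `<methodCall>` XML document. Services that follow this convention usually answer with a struct containing an error flag and a short message.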

Currently the following services are pinged with our module:

  1. BlogDigger.com
  2. BlogFlux.com
  3. Blogsearch.Google.com
  4. BlogRolling.com
  5. Bulkfeeds.net
  6. Feedburner.com
  7. Feedster.com
  8. IceRocket.com
  9. Pingomatic.com
  10. Syndic8.com
  11. Technorati.com
  12. My.Yahoo.com
  13. Weblogs.com
  14. Weblogalot.com

As a test, we dropped our search engine module on a new domain on 3/16/2007. With no links from external sites pointing to this domain, most if not all of the traffic was from our development at this point, averaging about 3-5 unique visitors a day (probably me, the client, and one or more of our developers). As of yesterday, the site had almost 150 unique visitors. Not bad for 2 weeks, considering we have done nothing else to this site other than add 10 total pages of content. This site is off to a great start and we have not even started a linking or submission campaign. See the graph below:

Also, check out the following post on Marketing Pilgrim about faster indexing through pinging Google Blog Search, which corroborates our results.

Ready to see the power of pinging blog and feed aggregators? Add a pinging service to your DNN site and watch the traffic roll in.

Tuesday, April 03, 2007 4:54:46 PM (US Eastern Standard Time, UTC-05:00)
 Sunday, April 01, 2007

We issued a patch release today for the Venexus Search Engine, a DotNetNuke search engine module. This release has a few minor bug fixes and tweaks:

  • Catalog creation in SQLDataProvider file moved to Search module (you still have to run the VenexusSearch module SQLDataProvider file through Host > SQL during a new installation, but not when upgrading).
  • Queue Importance added to allow some URLs to be crawled sooner than others. If you are running a global search engine, we now give preference to certain domain extensions (.gov and .edu).
  • Stores a single robots.txt file per domain instead of a history. Previously we stored a new robots.txt after each weekly check if it had been updated. Now we only store one.
  • Feed Title is now the portal/site name.
  • Added XML-RPC ping for BlogFlux to our blog/feed aggregator pinging service. We had to remove one aggregator when we added BlogFlux, so the list still has 14 services.
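To give a feel for the Queue Importance idea, here is a small Python sketch of a crawl queue that hands out .gov and .edu URLs first. It is only an illustration: the CrawlQueue class, the importance function, and the two-level weighting are assumptions made for this example, not the module's actual implementation.

```python
import heapq
from itertools import count
from urllib.parse import urlparse

# The two extensions come from the release note; the 0/1 weights are assumed.
PREFERRED_SUFFIXES = (".gov", ".edu")


def importance(url: str) -> int:
    """Lower number means crawl sooner."""
    host = urlparse(url).hostname or ""
    return 0 if host.endswith(PREFERRED_SUFFIXES) else 1


class CrawlQueue:
    """Hypothetical priority queue: preferred domains first, then FIFO."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker keeps insertion order stable

    def push(self, url: str) -> None:
        heapq.heappush(self._heap, (importance(url), next(self._order), url))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

URLs with the same importance come back in the order they were queued, so ordinary sites are not starved in any particular order; they simply wait behind the preferred ones.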

Get the free version of our DNN search module.

Sunday, April 01, 2007 9:32:08 PM (US Eastern Standard Time, UTC-05:00)
 Thursday, March 22, 2007

There is nothing better to start off the day than a client running into a 100% CPU utilization issue on their production SQL Server. Every few minutes, the server would spike and hang there for a variable amount of time (15 seconds to several minutes). You can only imagine the flakiness of a website with SQL Server choking to death. Nothing in the DNN event logs (Admin > LogViewer) pointed to a culprit, at least nothing we saw through brief spot checking and filtering by event type (the viewer was incredibly slow and timing out, so we abandoned all hope of using the DNN Admin/Host tools to find the problem). There was also not an alarming number of events logged in the EventLog table.

However, we have seen performance issues before that were resolved by clearing the Log Viewer. Clients with high traffic/usage sites, or with a broken/problematic module on every page, can end up with five- and six-figure row counts in the EventLog table, especially if the default DNN Log Viewer settings are used. We have even seen timeouts just trying to clear the event log once it gets that large (running "Delete EventLog" as a SQL statement from SSMS does the trick quickly). So we went ahead and cleared it, but the issue persisted.

For those who have not explored much in SQL Server Management Studio (the full version, not SSMS Express), there is now a Database Engine Tuning Advisor (DETA) and SQL Server Profiler (SSP, under Tools > SQL Server Profiler). Running SSP, we performed a trace and caught the offending SQL causing all of the havoc. Just a note: we have run DETA against trace files for several large DNN databases and applied its recommendations (it usually creates new indexes for tables with six- and seven-figure row counts, which helps greatly with performance). But in this case, we just started and stopped the trace in SSP before and after a huge, hanging spike. Going through the rows looking for CPU hits, we found the following 2 villains:

GetSchedule @Server='SERVERNAME'

GetScheduleNextTask @Server='SERVERNAME'

Running these statements reproduced the huge spike on command, pegging the server hard. Looking in the stored proc, it hits the Schedule and ScheduleHistory tables:

ALTER PROCEDURE [dbo].[GetSchedule]
    @Server varchar(150)
AS
SELECT S.ScheduleID, S.TypeFullName, S.TimeLapse, S.TimeLapseMeasurement, S.RetryTimeLapse, S.RetryTimeLapseMeasurement, S.ObjectDependencies, S.AttachToEvent, S.RetainHistoryNum, S.CatchUpEnabled, S.Enabled, SH.NextStart, S.Servers
FROM Schedule S
LEFT JOIN ScheduleHistory SH
    ON S.ScheduleID = SH.ScheduleID
WHERE (SH.ScheduleHistoryID = (SELECT TOP 1 S1.ScheduleHistoryID FROM ScheduleHistory S1 WHERE S1.ScheduleID = S.ScheduleID ORDER BY S1.NextStart DESC)
    OR SH.ScheduleHistoryID IS NULL)
    AND (@Server IS NULL or S.Servers LIKE ',%' + @Server + '%,' or S.Servers IS NULL)
GROUP BY S.ScheduleID, S.TypeFullName, S.TimeLapse, S.TimeLapseMeasurement, S.RetryTimeLapse, S.RetryTimeLapseMeasurement, S.ObjectDependencies, S.AttachToEvent, S.RetainHistoryNum, S.CatchUpEnabled, S.Enabled, SH.NextStart, S.Servers

In ScheduleHistory we found a little over 6 thousand rows. You can use the following to check your db:

select count(*) from schedulehistory

6000+ rows does not seem like enough to cause that much of a spike, but we were getting desperate at this point, so we deleted them all:

delete schedulehistory

Executing the 2 sprocs for the schedule again, CPU barely got over 3% utilization. The site was fast and responsive again, and I was able to get in and check settings without getting timeouts. So, as an interim fix, I lowered the defaults for DotNetNuke.Services.Scheduling.PurgeScheduleHistory under Host > Schedule.

I am concerned about why 6000 rows of data would take such a toll on CPU resources. However, that is more records than I believe should be there, so lowering the defaults will help. Nothing in the DNN stored procedure for GetSchedule really stands out to me as problematic, nor at first glance do I see anything that could be changed to help, but I will ponder on this some more in my copious spare time.

So, if you are having trouble with SQL Server performance and DNN, check and make sure you keep your EventLog and ScheduleHistory purged.
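To make the "keep it purged" idea concrete, here is a small Python sketch of a retention purge that keeps only the newest RetainHistoryNum rows per schedule. Several things here are assumptions for illustration: it runs against an in-memory SQLite stand-in rather than SQL Server, the tables carry only the columns the example needs, ScheduleHistoryID is used as a proxy for recency, and this is not DNN's own purge procedure.

```python
import sqlite3


def purge_schedule_history(conn: sqlite3.Connection) -> int:
    """Delete history rows beyond the newest RetainHistoryNum per schedule."""
    cur = conn.execute(
        """
        DELETE FROM ScheduleHistory
        WHERE ScheduleHistoryID IN (
            SELECT SH.ScheduleHistoryID
            FROM ScheduleHistory SH
            JOIN Schedule S ON S.ScheduleID = SH.ScheduleID
            WHERE (
                -- how many rows of the same schedule are newer than this one
                SELECT COUNT(*)
                FROM ScheduleHistory Newer
                WHERE Newer.ScheduleID = SH.ScheduleID
                  AND Newer.ScheduleHistoryID > SH.ScheduleHistoryID
            ) >= S.RetainHistoryNum
        )
        """
    )
    conn.commit()
    return cur.rowcount


def demo() -> tuple:
    """Build a toy database with 50 history rows and retention set to 5."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Schedule ("
                 "ScheduleID INTEGER PRIMARY KEY, RetainHistoryNum INTEGER)")
    conn.execute("CREATE TABLE ScheduleHistory ("
                 "ScheduleHistoryID INTEGER PRIMARY KEY, ScheduleID INTEGER)")
    conn.execute("INSERT INTO Schedule VALUES (1, 5)")
    conn.executemany("INSERT INTO ScheduleHistory (ScheduleID) VALUES (?)",
                     [(1,)] * 50)
    deleted = purge_schedule_history(conn)
    remaining = conn.execute(
        "SELECT COUNT(*) FROM ScheduleHistory").fetchone()[0]
    return deleted, remaining
```

Running demo() on the toy data removes the 45 oldest rows and leaves the 5 newest. On a real DNN database you would adjust the retention settings under Host > Schedule as described above rather than run ad hoc SQL.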

If you need help, be sure to check out our DNN Support Packages.

Thursday, March 22, 2007 1:12:18 PM (US Eastern Standard Time, UTC-05:00)
 Wednesday, March 14, 2007

We ran into an issue parsing some HTML for a DNN content migration project we are working on. We needed to find the actual content of the page, without all of the look and feel. Luckily we found a pretty solid case of an opening and closing div tag that wrapped the entire content of the page. At first we had a basic regular expression for finding the div tags like so:

Dim sRegEx As String = "<div align=" & Chr(34) & "center" & Chr(34) & "[\d\D]*?\</div>"

This worked fine until we ran into some code that had div tags within the div tags. In the example below, the lazy match stops at the first closing </div>, so everything after the inner div is left out of the result.

Example:

<div align="center">

Some text here

<div> this is between another div</div>

Here is more text that should be in the content we are ripping.

</div>

After some digging, the following regex does the trick:

Dim regexp As Regex = New Regex( _
    "(<[^>]*?div[^>]*?(?:center)[^>]*>)((?:.*?(?:<[ \r\t]*div[^>]*>?.*?(?:<.*?/.*?div.*?>)?)*)*)(<[^>]*?/[^>]*?div[^>]*?>.*</div>)", _
    RegexOptions.IgnoreCase _
    Or RegexOptions.Singleline _
)
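If the regex gets hard to maintain, a different technique is to count div depth instead of matching everything in one pattern. The Python sketch below is an alternative to the regex above, not a port of it; the function name and the assumption that the wrapper is a div with align="center" come from the example, and it does not try to handle div tags hidden inside comments or scripts.

```python
import re

# Matches any opening or closing div tag; group 1 is "/" for a closing tag.
DIV_TAG = re.compile(r'<\s*(/?)\s*div\b[^>]*>', re.IGNORECASE)
# Matches the wrapper we are looking for: <div ... align="center" ...>
OPEN_CENTER = re.compile(r'<\s*div\b[^>]*\balign\s*=\s*"center"[^>]*>',
                         re.IGNORECASE)


def extract_center_div(html: str):
    """Return the inner HTML of the first <div align="center">, or None."""
    start = OPEN_CENTER.search(html)
    if start is None:
        return None
    depth = 1  # we are inside the wrapper div
    for tag in DIV_TAG.finditer(html, start.end()):
        depth += -1 if tag.group(1) else 1
        if depth == 0:
            # This closing tag balances the wrapper's opening tag.
            return html[start.end():tag.start()]
    return None  # unbalanced markup: the wrapper never closes
```

On the example above, this returns everything between the outer tags, including the nested div and the text that follows it.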

Wednesday, March 14, 2007 1:59:48 AM (US Eastern Standard Time, UTC-05:00)
 Tuesday, March 06, 2007

We released the latest version of our search engine module last week. It has all of the features I mentioned in my previous post, plus the ability to add excluded URLs and partial URLs.

Here are the new specs:

Items marked with * are new

| Features | Standard Version | Pro Version |
|---|---|---|
| Seamus Features | | |
| Maximum # of Pages | 500 | Unlimited |
| Install on commercial site | No | Yes |
| Scheduled Index Updates | Yes | Yes |
| Announcements Module Support | Yes | Yes |
| Contacts Module Support | Yes | Yes |
| Events Module Support | Yes | Yes |
| FAQ Module Support | Yes | Yes |
| Links Module Support | Yes | Yes |
| Text/HTML Module Support | Yes | Yes |
| Index MS Excel Documents * | No | Yes |
| Index MS PowerPoint Documents * | No | Yes |
| Index MS Word Documents * | No | Yes |
| Index PDF Documents * | No | Yes |
| Index Rich Text Files * | No | Yes |
| Global Crawler * | No | Yes |
| Allows users to add feeds | No | Yes |
| Custom User Agent | No | Yes |
| Obeys Robots.txt | Yes | Yes |
| TTL Support | Yes | Yes |
| Feed and Queue Aggregation Using AJAX | Yes | Yes |
| Display Top X Latest Items | Yes | Yes |
| XSLT Support | Yes | Yes |
| Latest Items RSS Feed Generation | Yes | Yes |
| Portal Specific Feed | Yes | Yes |
| Enclosure/Podcast Support | No | Yes |
| Pinging Service | No | Yes |
| Exclude URLs * | Yes | Yes |
| Search Features | | |
| Search Skin Object | Yes | Yes |
| Use Image or Text for Search button | Yes | Yes |
| + and - (AND and OR) Support | Yes | Yes |
| Quoted Search Support | Yes | Yes |
| Keyword Highlighting | Yes | Yes |
| Obeys DNN Security | Yes | Yes |
| Social Bookmarking Support * | No | Yes |
| Support | | |
| Issue Tracker | Yes | Yes |
| Email | No | Yes |
| Phone | No | 1 Call |
| Price | Free | $199 Per Year |

You can download the free version here.

Tuesday, March 06, 2007 11:25:17 AM (US Eastern Standard Time, UTC-05:00)
 Monday, February 26, 2007

The Venexus Search Engine (VSE) is a DotNetNuke search module, plus a whole lot more. Our DNN search module indexes not only your portal, but also external sites. VSE crawls pages on your site, aggregates RSS feeds from other sites, and crawls any links to external websites, making it a full search engine module. Unlike the core DNN Search module, which uses a scheduled task to perform index updates, VSE crawls and indexes content based on user requests. Seamus can be configured for several different setups and displays, including the ability to hide the module on every page. When a page is loaded that has the Seamus module on it, Seamus will go out and grab 3 RSS feeds and 3 queued URLs and add any new or updated content to the index. To avoid delaying the page load for the end user, Seamus uses AJAX to make the aggregation requests, providing a seamless integration into your site.
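The "3 feeds and 3 URLs per page load" behavior can be pictured as taking a small batch off the front of two queues on each request. The Python sketch below only illustrates that batching idea; the function names, the queue types, and the returned dictionary shape are hypothetical and not taken from the module.

```python
from collections import deque

BATCH_SIZE = 3  # from the post: 3 RSS feeds and 3 queued URLs per page load


def take_batch(queue: deque, size: int = BATCH_SIZE) -> list:
    """Remove and return up to `size` items from the front of the queue."""
    return [queue.popleft() for _ in range(min(size, len(queue)))]


def on_page_load(feeds: deque, urls: deque) -> dict:
    """Pick the work for one background aggregation request (hypothetical)."""
    return {"feeds": take_batch(feeds), "urls": take_batch(urls)}
```

Because each request only handles a small, fixed amount of work, the crawl is spread across normal site traffic instead of running as one long job.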

 

Here are the pro features of VSE:

 

1.1 Pro Features

  • Allow users to add their feeds

You can enable users to add feeds to the system.

 

  • Podcast Support

Indexed items that have files associated with them are used as enclosures (aka podcast) in the feeds that Seamus generates.

 

  • Pinging service

When the pinging service is enabled, every time something is added or updated on your site Seamus will “ping” several XML-RPC web services for blog and feed aggregation sites to notify them that your portal has new content. The aggregators will then come to your site, aggregate your feed, and provide their users with links to your site.

 

Here are a few services we ping:

 

  • Custom User-Agent

You can set your own user-agent to specify your own crawler name. The default user-agent is “Seamus/1.1 PRO ( http://search.venexus.com)”.

 

  • Global Crawler

The pro version allows you to be a global crawler. Any links found on your site, from aggregated news feeds, or from external links are crawled and indexed.

 

So, not only are you able to aggregate even more content with the Pro version when compared to the Standard version, but you also get the search engine optimization benefits of pinging all of the major blog and feed aggregation services. This provides you with links directly to your site, generating more web traffic. You can watch your page rank grow very quickly with this feature.

 

Since the release of the 1.1 version, we have steadily been working on the 1.2 version. We are now testing the latest version on our demo site: search.venexus.com.

 

1.2 Pro Features (March 1, 2007 Release)

  • New file formats indexed

You asked for it, so we added support for all of the most common Office document file types as well as PDF documents. We have added a new document-to-text converter to our crawler that is able to parse the actual text from these documents. So not only does Seamus crawl and index HTML, Text, and XML files, but also the following new formats:

1.     Excel files

2.     PDF files

3.     PowerPoint files

4.     Rich text files

5.     Word documents

 

  • Social Bookmarking Support

In the search results you can enable social bookmarking to allow users to easily add bookmarks to their favorite social bookmarking application/service. This allows users to easily find their favorite links to your site. Also, the sites that provide this service will generate a link to your site, giving you more traffic once again.

 

Here is an example of what it looks like:

Here are the supported sites:

 

1.     Digg

2.     del.icio.us

3.     FURL

4.     Reddit

5.     Yahoo

6.     Blinklist

7.     Google

8.     ma.gnolia

9.     Shadows

10.  Technorati

 

Ready for a real search engine for your site? Buy the Pro version here.

 

Stay tuned for more…

Monday, February 26, 2007 5:04:36 AM (US Eastern Standard Time, UTC-05:00)
 Wednesday, February 21, 2007

We released the Pro version of our DNN search engine module today.

Here is the breakdown of the feature comparison:

Venexus Search Engine Version Matrix

| Features | Standard Version | Pro Version |
|---|---|---|
| Seamus Features | | |
| Maximum # of Pages | 500 | Unlimited |
| Install on commercial site | No | Yes |
| Scheduled Index Updates | Yes | Yes |
| Announcements Module Support | Yes | Yes |
| Contacts Module Support | Yes | Yes |
| Events Module Support | Yes | Yes |
| FAQ Module Support | Yes | Yes |
| Links Module Support | Yes | Yes |
| Text/HTML Module Support | Yes | Yes |
| Allows users to add feeds | No | Yes |
| Custom User Agent | No | Yes |
| Obeys Robots.txt | Yes | Yes |
| TTL Support | Yes | Yes |
| Feed Aggregation Using AJAX | Yes | Yes |
| Display Top X Latest Items | Yes | Yes |
| XSLT Support | Yes | Yes |
| Latest Items RSS Feed Generation | Yes | Yes |
| Portal Specific Feed | Yes | Yes |
| Enclosure/Podcast Support | No | Yes |
| Pinging Service | No | Yes |
| Search Features | | |
| Search Skin Object | Yes | Yes |
| Use Image or Text for Search button | Yes | Yes |
| + and - (AND and OR) Support | Yes | Yes |
| Quoted Search Support | Yes | Yes |
| Keyword Highlighting | Yes | Yes |
| Obeys DNN Security | Yes | Yes |
| Support | | |
| Issue Tracker | Yes | Yes |
| Email | No | Yes |
| Phone | No | 1 Call |
| Price | Free | $199 Per Year |

I will be discussing the features of the Pro version in a later post. Stay tuned...

Wednesday, February 21, 2007 2:52:51 PM (US Eastern Standard Time, UTC-05:00)
 Monday, February 19, 2007

We have released the new version of the Venexus Search Engine. VSE Standard Version 1.1.0 has several bug fixes and shows some of the new features of the Pro version.

New standard features and bug fixes:

  • VenexusSeamus - Changed TransformXSL to not create a temporary XML file
  • VenexusSeamus - Modified Response.Charset
  • VenexusSeamus - New Delete Tabs routine for removing deleted and expired tabs
  • VenexusSeamus - Ability to reload default XSLT file
  • VenexusSeamus - Shows total number of aggregated items
  • VenexusSeamus - Gridview pagination
  • VenexusSeamus - Link from Grid to show aggregation errors
  • VenexusSeamus - Guid attribute added
  • VenexusSeamus - application/rss+xml support
  • VenexusSeamus - Automatic creation of fulltext index during installation (works for SQL Server Express too!)
  • VenexusSearch - Support for DNN 4.4.1 and "search" URL parameter
  • VenexusSearch - Non-authenticated postback issue resolved
  • VenexusSearch - Limits URL length for display
  • VenexusSearch - Quoted query support

If you have any issues with installation, configuration, or bugs, please post them in our issue tracker.

Monday, February 19, 2007 5:17:05 PM (US Eastern Standard Time, UTC-05:00)
 Wednesday, February 14, 2007

Here is a video tutorial on setting up SQL Server 2005 Express and Full-Text Indexing. It breaks down the steps for installing SQL Server Express with Advanced Services. This is a great video that shows a lot more than just setting up full-text indexing. It also shows some basic queries.

The key points during installation: when you get to the Registration Information screen, uncheck "Hide advanced configuration options" before clicking Next. On the next screen, expand Database Services and set Full-Text Search to "Entire feature will be installed on local hard drive". After a few more steps, you must uncheck User Instances Enabled. If you already have Full-Text Search installed but did not uncheck that option, you can use the following SQL:

sp_configure 'user instances enabled', '0'

If you are using SQL Server Management Studio Express, you can go into the database properties and, under Files, make sure "enable full-text indexing" is checked. Or, run the following SQL:

sp_fulltext_database 'enable'

Now for creating the catalog and index. The example below is for our search engine module:

Create fulltext catalog VenexusSearchCatalog

Create Unique Index PKVenexusSearchEngine On Venexus_BrainDump(IndexID)

Create fulltext index On Venexus_BrainDump (IndexURL, IndexTitle, IndexWashedContent)
Key Index PKVenexusSearchEngine On VenexusSearchCatalog
With Change_Tracking Auto


 

Wednesday, February 14, 2007 2:25:21 AM (US Eastern Standard Time, UTC-05:00)
Copyright © 2010 Venexus, Inc.. All rights reserved.